# Deploying AudioQnA on Intel® Xeon® Processors

This document outlines the single-node deployment process for an AudioQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservices on an Intel® Xeon® server. The steps include pulling Docker images, deploying the containers via Docker Compose, and running the services, including the `llm` microservice.

Note: The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure you have either requested and been granted access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or downloaded the model locally from [ModelScope](https://www.modelscope.cn/models).

## Table of Contents

1. [AudioQnA Quick Start Deployment](#audioqna-quick-start-deployment)
2. [AudioQnA Docker Compose Files](#audioqna-docker-compose-files)
3. [Validate Microservices](#validate-microservices)
4. [Conclusion](#conclusion)

## AudioQnA Quick Start Deployment

This section describes how to quickly deploy and test the AudioQnA service manually on an Intel® Xeon® processor. The basic steps are:

1. [Access the Code](#access-the-code)
2. [Configure the Deployment Environment](#configure-the-deployment-environment)
3. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose)
4. [Check the Deployment Status](#check-the-deployment-status)
5. [Validate the Pipeline](#validate-the-pipeline)
6. [Cleanup the Deployment](#cleanup-the-deployment)

### Access the Code

Clone the GenAIExamples repository and access the AudioQnA Intel® Xeon® platform Docker Compose files and supporting scripts:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/AudioQnA
```

Then check out a released version, such as v1.2:

```bash
git checkout v1.2
```

### Configure the Deployment Environment

To set up the environment variables for deploying the AudioQnA services, set the parameters specific to the deployment environment and source the `set_env.sh` script in this directory:

```bash
export host_ip="External_Public_IP" # ip address of the node
export HUGGINGFACEHUB_API_TOKEN="Your_HuggingFace_API_Token"
export http_proxy="Your_HTTP_Proxy" # http proxy if any
export https_proxy="Your_HTTPs_Proxy" # https proxy if any
export no_proxy=localhost,127.0.0.1,$host_ip,whisper-service,speecht5-service,vllm-service,tgi-service,audioqna-xeon-backend-server,audioqna-xeon-ui-server # additional no proxies if needed
export NGINX_PORT=${your_nginx_port} # your usable port for nginx, 80 for example
source ./set_env.sh
```
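
If the external IP address of the node is not known offhand, it can be derived from the primary network interface. A minimal sketch, assuming a Linux host where `hostname -I` is available:

```bash
# use the first address reported for this host as the externally reachable IP
export host_ip=$(hostname -I | awk '{print $1}')
```

Do not use `localhost` for `host_ip`; the services need an address that is reachable from inside the Docker network.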

Consult the [AudioQnA Docker Compose Files](#audioqna-docker-compose-files) section for information on how service-specific configuration parameters affect deployments.

### Deploy the Services Using Docker Compose

To deploy the AudioQnA services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute the command below, which uses the `compose.yaml` file:

```bash
cd docker_compose/intel/cpu/xeon
docker compose -f compose.yaml up -d
```

> **Note**: Developers should build the Docker images from source when:
>
> - Developing off the git main branch (since the container ports in the repo may differ from those of the published Docker images).
> - Unable to download the Docker images.
> - Needing a specific version of a Docker image.

Please refer to the table below to build different microservices from source:

| Microservice | Deployment Guide                                                                                                                  |
| ------------ | --------------------------------------------------------------------------------------------------------------------------------- |
| vLLM         | [vLLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/vllm#build-docker)                     |
| LLM          | [LLM build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/llms)                                                 |
| WHISPER      | [Whisper build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/src#211-whisper-server-image)                 |
| SPEECHT5     | [SpeechT5 build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/tts/src#211-speecht5-server-image)               |
| GPT-SOVITS   | [GPT-SOVITS build guide](https://github.com/opea-project/GenAIComps/tree/main/comps/third_parties/gpt-sovits/src#build-the-image)  |
| MegaService  | [MegaService build guide](../../../../README_miscellaneous.md#build-megaservice-docker-image)                                      |
| UI           | [Basic UI build guide](../../../../README_miscellaneous.md#build-ui-docker-image)                                                  |

### Check the Deployment Status

After running Docker Compose, check whether all the containers launched via the compose file have started:

```bash
docker ps -a
```

For the default deployment, the following 5 containers should have started:

```
1c67e44c39d2   opea/audioqna-ui:latest   "docker-entrypoint.s…"   About a minute ago   Up About a minute             0.0.0.0:5173->5173/tcp, :::5173->5173/tcp   audioqna-xeon-ui-server
833a42677247   opea/audioqna:latest      "python audioqna.py"     About a minute ago   Up About a minute             0.0.0.0:3008->8888/tcp, :::3008->8888/tcp   audioqna-xeon-backend-server
5dc4eb9bf499   opea/speecht5:latest      "python speecht5_ser…"   About a minute ago   Up About a minute             0.0.0.0:7055->7055/tcp, :::7055->7055/tcp   speecht5-service
814e6efb1166   opea/vllm:latest          "python3 -m vllm.ent…"   About a minute ago   Up About a minute (healthy)   0.0.0.0:3006->80/tcp, :::3006->80/tcp       vllm-service
46f7a00f4612   opea/whisper:latest       "python whisper_serv…"   About a minute ago   Up About a minute             0.0.0.0:7066->7066/tcp, :::7066->7066/tcp   whisper-service
```
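
If a container is missing or repeatedly restarting, inspect its logs. For example, to view the most recent output of the vLLM serving container (container names are shown in the listing above):

```bash
# show the last 20 log lines of the vLLM serving container
docker logs vllm-service 2>&1 | tail -n 20
```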

If any issues are encountered during deployment, refer to the [Troubleshooting](../../../../README_miscellaneous.md#troubleshooting) section.

### Validate the Pipeline

Once the AudioQnA services are running, test the pipeline using the following command:

```bash
# Test the AudioQnA megaservice by recording a .wav file, encoding the file into the base64 format,
# and then sending the base64 string to the megaservice endpoint. The megaservice will return a
# spoken response as a base64 string; to listen to it, decode the base64 string and save it as a .wav file.
wget https://github.com/intel/intel-extension-for-transformers/raw/refs/heads/main/intel_extension_for_transformers/neural_chat/assets/audio/sample_2.wav
base64_audio=$(base64 -w 0 sample_2.wav)

# if you are using speecht5 as the tts service, voice can be "default" or "male"
# if you are using gpt-sovits for the tts service, you can set the reference audio following
# https://github.com/opea-project/GenAIComps/blob/main/comps/third_parties/gpt-sovits/src/README.md

curl http://${host_ip}:3008/v1/audioqna \
  -X POST \
  -H "Content-Type: application/json" \
  -d "{\"audio\": \"${base64_audio}\", \"max_tokens\": 64, \"voice\": \"default\"}" \
  | sed 's/^"//;s/"$//' | base64 -d > output.wav
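
To listen to the answer, play the decoded file with any audio player. A sketch, assuming FFmpeg's `ffplay` (or ALSA's `aplay`) is installed:

```bash
# play the synthesized response and exit when playback finishes
ffplay -autoexit -nodisp output.wav
# or, with ALSA utilities:
# aplay output.wav
```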

**Note**: Access the AudioQnA UI via a web browser at `http://${host_ip}:5173`. Please confirm that port `5173` is open in the firewall. To validate each microservice used in the pipeline, refer to the [Validate Microservices](#validate-microservices) section.

### Cleanup the Deployment

To stop the containers associated with the deployment, execute the following command, passing the same compose file used to start the services:

```bash
docker compose -f compose.yaml down
```

## AudioQnA Docker Compose Files

When deploying an AudioQnA pipeline on an Intel® Xeon® platform, different large language model (LLM) serving frameworks, and either a single-language (English) or multi-language TTS component, can be selected. The table below outlines the configurations available as part of the application. These configurations can be used as templates and extended to other components available in [GenAIComps](https://github.com/opea-project/GenAIComps.git).

| File                                               | Description                                                                                |
| -------------------------------------------------- | ------------------------------------------------------------------------------------------ |
| [compose.yaml](./compose.yaml)                     | Default compose file using vLLM as the LLM serving framework                                |
| [compose_tgi.yaml](./compose_tgi.yaml)             | The LLM serving framework is TGI. All other configurations remain the same as the default  |
| [compose_multilang.yaml](./compose_multilang.yaml) | The TTS component is GPT-SoVITS. All other configurations remain the same as the default   |
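
For example, to start the pipeline with the TGI LLM serving backend or the multi-language GPT-SoVITS TTS component instead of the defaults, pass the corresponding file to `docker compose`:

```bash
# TGI as the LLM serving backend
docker compose -f compose_tgi.yaml up -d

# multi-language TTS with GPT-SoVITS
docker compose -f compose_multilang.yaml up -d
```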

## Validate Microservices

1. Whisper Service
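
   To check the ASR component in isolation, send an audio file to the Whisper endpoint. A sketch, assuming `set_env.sh` exports `WHISPER_SERVER_PORT` (default `7066`) and reusing the sample clip downloaded earlier:

   ```bash
   # the response should contain the transcribed text of the clip
   curl http://${host_ip}:${WHISPER_SERVER_PORT}/v1/audio/transcriptions \
     -H "Content-Type: multipart/form-data" \
     -F file="@./sample_2.wav" \
     -F model="openai/whisper-small"
   ```

2. LLM Backend Service

   To check the LLM serving backend directly, send an OpenAI-compatible chat completion request. A sketch, assuming the default vLLM backend on `LLM_SERVER_PORT` (default `3006`) and the default model:

   ```bash
   # a short completion confirms the model is loaded and serving
   curl http://${host_ip}:${LLM_SERVER_PORT}/v1/chat/completions \
     -X POST \
     -H "Content-Type: application/json" \
     -d '{"model": "meta-llama/Meta-Llama-3-8B-Instruct", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens": 32}'
   ```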

3. TTS Service

   ```bash
   # speecht5 service
   curl http://${host_ip}:${SPEECHT5_SERVER_PORT}/v1/audio/speech -XPOST -d '{"input": "Who are you?"}' -H 'Content-Type: application/json' --output speech.mp3

   # gpt-sovits service (optional)
   curl http://${host_ip}:${GPT_SOVITS_SERVER_PORT}/v1/audio/speech -XPOST -d '{"input": "Who are you?"}' -H 'Content-Type: application/json' --output speech.mp3
   ```

## Conclusion

This guide should enable developers to deploy the default configuration or any of the other compose files for the different configurations. It also highlights the configurable parameters that can be set before deployment.