Commit 39abef8

SearchQnA App - Adding files to deploy SearchQnA application on AMD GPU (#1193)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
1 parent ed16308 commit 39abef8


4 files changed, +525 −0 lines changed
Lines changed: 179 additions & 0 deletions
# Build and deploy SearchQnA Application on AMD GPU (ROCm)

## Build images

### Build Embedding Image

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
docker build --no-cache -t opea/embedding:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/src/Dockerfile .
```

### Build Retriever Image

```bash
docker build --no-cache -t opea/web-retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/web_retrievers/src/Dockerfile .
```

### Build Rerank Image

```bash
docker build --no-cache -t opea/reranking:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/rerankings/src/Dockerfile .
```

### Build the LLM Docker Image

```bash
docker build -t opea/llm-textgen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/src/text-generation/Dockerfile .
```

### Build the MegaService Docker Image

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/SearchQnA
docker build --no-cache -t opea/searchqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
```

### Build the UI Docker Image

```bash
cd GenAIExamples/SearchQnA/ui
docker build --no-cache -t opea/searchqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
```
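
After the builds complete, you can check that all the images are present locally; a quick sanity check (the grep pattern simply matches the tags used above):

```bash
# List the freshly built OPEA images
docker images | grep -E 'opea/(embedding|web-retriever|reranking|llm-textgen|searchqna|searchqna-ui)'
```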

## Deploy SearchQnA Application

### Features of Docker compose for AMD GPUs

GPU devices are forwarded to the TGI service container with the following settings:

```yaml
shm_size: 1g
devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri/:/dev/dri/
cap_add:
  - SYS_PTRACE
group_add:
  - video
security_opt:
  - seccomp:unconfined
```

In this configuration all GPUs are passed through to the container. To pass through only a specific GPU, list its device names cardN and renderDN instead.

For example:

```yaml
shm_size: 1g
devices:
  - /dev/kfd:/dev/kfd
  - /dev/dri/card0:/dev/dri/card0
  - /dev/dri/renderD128:/dev/dri/renderD128
cap_add:
  - SYS_PTRACE
group_add:
  - video
security_opt:
  - seccomp:unconfined
```

To find out which cardN and renderDN device IDs belong to the same GPU, use a GPU driver utility such as `rocm-smi`.
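
For instance, a minimal way to map device nodes to physical GPUs, assuming the ROCm utilities are installed on the host:

```bash
# Show the PCI bus ID of each GPU
rocm-smi --showbus
# The by-path symlinks pair each PCI bus ID with its cardN/renderDN nodes
ls -l /dev/dri/by-path/
```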

### Go to the directory with the Docker compose file

```bash
cd GenAIExamples/SearchQnA/docker_compose/amd/gpu/rocm
```

### Set environments

Set the required values in `GenAIExamples/SearchQnA/docker_compose/amd/gpu/rocm/set_env.sh`. The purpose of each variable is described in a comment next to the corresponding assignment.

```bash
chmod +x set_env.sh
. set_env.sh
```
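
After sourcing the script, it is worth confirming that the derived endpoints resolved as expected; a simple check:

```bash
# Both URLs should contain the host IPs configured in set_env.sh
echo "TEI embedding endpoint: ${SEARCH_TEI_EMBEDDING_ENDPOINT}"
echo "Backend endpoint: ${SEARCH_BACKEND_SERVICE_ENDPOINT}"
```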

### Run services

```bash
docker compose up -d
```
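
Before running the validation requests below, check that all containers started and that TGI has finished loading the model (a sketch; the container name follows the compose file in this commit):

```bash
# All services should be in a running state
docker compose ps
# TGI warm-up can take several minutes; the router logs "Connected" when ready
docker logs -f search-tgi-service
```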

## Validate the MicroServices and MegaService

### Validate TEI service

```bash
curl http://${SEARCH_HOST_IP}:3001/embed \
  -X POST \
  -d '{"inputs":"What is Deep Learning?"}' \
  -H 'Content-Type: application/json'
```

### Validate Embedding service

```bash
curl http://${SEARCH_HOST_IP}:3002/v1/embeddings \
  -X POST \
  -d '{"text":"hello"}' \
  -H 'Content-Type: application/json'
```

### Validate Web Retriever service

```bash
export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${SEARCH_HOST_IP}:3003/v1/web_retrieval \
  -X POST \
  -d "{\"text\":\"What is the 2024 holiday schedule?\",\"embedding\":${your_embedding}}" \
  -H 'Content-Type: application/json'
```

### Validate TEI Reranking service

```bash
curl http://${SEARCH_HOST_IP}:3004/rerank \
  -X POST \
  -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
  -H 'Content-Type: application/json'
```

### Validate Reranking service

```bash
curl http://${SEARCH_HOST_IP}:3005/v1/reranking \
  -X POST \
  -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \
  -H 'Content-Type: application/json'
```

### Validate TGI service

```bash
curl http://${SEARCH_HOST_IP}:3006/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
  -H 'Content-Type: application/json'
```
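
To confirm that generation actually runs on the GPU rather than falling back to CPU, you can watch utilization from the host while the request is in flight (assumes `rocm-smi` is installed on the host):

```bash
# GPU utilization and VRAM use should rise while /generate is processing
watch -n 1 rocm-smi
```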

### Validate LLM service

```bash
curl http://${SEARCH_HOST_IP}:3007/v1/chat/completions \
  -X POST \
  -d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
  -H 'Content-Type: application/json'
```

### Validate MegaService

```bash
curl http://${SEARCH_HOST_IP}:${SEARCH_BACKEND_SERVICE_PORT}/v1/searchqna -H "Content-Type: application/json" -d '{
  "messages": "What is the latest news? Give me also the source link.",
  "stream": "True"
}'
```
Lines changed: 173 additions & 0 deletions
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  search-tei-embedding-service:
    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
    container_name: search-tei-embedding-server
    ports:
      - "3001:80"
    volumes:
      - "./data:/data"
    shm_size: 1g
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HF_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
      HUGGING_FACE_HUB_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
      HUGGINGFACEHUB_API_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
    command: --model-id ${SEARCH_EMBEDDING_MODEL_ID} --auto-truncate
  search-embedding:
    image: ${REGISTRY:-opea}/embedding:${TAG:-latest}
    container_name: search-embedding-server
    depends_on:
      - search-tei-embedding-service
    ports:
      - "3002:6000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TEI_EMBEDDING_HOST_IP: ${SEARCH_HOST_IP}
      TEI_EMBEDDING_ENDPOINT: ${SEARCH_TEI_EMBEDDING_ENDPOINT}
      HF_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
    restart: unless-stopped
  search-web-retriever:
    image: ${REGISTRY:-opea}/web-retriever:${TAG:-latest}
    container_name: search-web-retriever-server
    ports:
      - "3003:7077"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TEI_EMBEDDING_ENDPOINT: ${SEARCH_TEI_EMBEDDING_ENDPOINT}
      GOOGLE_API_KEY: ${SEARCH_GOOGLE_API_KEY}
      GOOGLE_CSE_ID: ${SEARCH_GOOGLE_CSE_ID}
    restart: unless-stopped
  search-tei-reranking-service:
    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
    container_name: search-tei-reranking-server
    ports:
      - "3004:80"
    volumes:
      - "./data:/data"
    shm_size: 1g
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    command: --model-id ${SEARCH_RERANK_MODEL_ID} --auto-truncate
  search-reranking:
    image: ${REGISTRY:-opea}/reranking:${TAG:-latest}
    container_name: search-reranking-server
    depends_on:
      - search-tei-reranking-service
    ports:
      - "3005:8000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TEI_RERANKING_ENDPOINT: ${SEARCH_TEI_RERANKING_ENDPOINT}
      HF_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
      HUGGING_FACE_HUB_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
      HUGGINGFACEHUB_API_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
    restart: unless-stopped
  search-tgi-service:
    image: ghcr.io/huggingface/text-generation-inference:2.3.1-rocm
    container_name: search-tgi-service
    ports:
      - "3006:80"
    volumes:
      - "./data:/data"
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HUGGING_FACE_HUB_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
      HUGGINGFACEHUB_API_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
    shm_size: 1g
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri/:/dev/dri/
    cap_add:
      - SYS_PTRACE
    group_add:
      - video
    security_opt:
      - seccomp:unconfined
    ipc: host
    command: --model-id ${SEARCH_LLM_MODEL_ID} --max-input-length 1024 --max-total-tokens 2048
  search-llm:
    image: ${REGISTRY:-opea}/llm-textgen:${TAG:-latest}
    container_name: search-llm-server
    depends_on:
      - search-tgi-service
    ports:
      - "3007:9000"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${SEARCH_TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
      LLM_ENDPOINT: ${SEARCH_TGI_LLM_ENDPOINT}
      LLM_MODEL_ID: ${SEARCH_LLM_MODEL_ID}
      LLM_MODEL: ${SEARCH_LLM_MODEL_ID}
      HF_TOKEN: ${SEARCH_HUGGINGFACEHUB_API_TOKEN}
      OPENAI_API_KEY: ${SEARCH_OPENAI_API_KEY}
    restart: unless-stopped
  search-backend-server:
    image: ${REGISTRY:-opea}/searchqna:${TAG:-latest}
    container_name: search-backend-server
    depends_on:
      - search-tei-embedding-service
      - search-embedding
      - search-web-retriever
      - search-tei-reranking-service
      - search-reranking
      - search-tgi-service
      - search-llm
    ports:
      - "${SEARCH_BACKEND_SERVICE_PORT:-3008}:8888"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - MEGA_SERVICE_HOST_IP=${SEARCH_MEGA_SERVICE_HOST_IP}
      - EMBEDDING_SERVICE_HOST_IP=${SEARCH_EMBEDDING_SERVICE_HOST_IP}
      - WEB_RETRIEVER_SERVICE_HOST_IP=${SEARCH_WEB_RETRIEVER_SERVICE_HOST_IP}
      - RERANK_SERVICE_HOST_IP=${SEARCH_RERANK_SERVICE_HOST_IP}
      - LLM_SERVICE_HOST_IP=${SEARCH_LLM_SERVICE_HOST_IP}
      - EMBEDDING_SERVICE_PORT=${SEARCH_EMBEDDING_SERVICE_PORT}
      - WEB_RETRIEVER_SERVICE_PORT=${SEARCH_WEB_RETRIEVER_SERVICE_PORT}
      - RERANK_SERVICE_PORT=${SEARCH_RERANK_SERVICE_PORT}
      - LLM_SERVICE_PORT=${SEARCH_LLM_SERVICE_PORT}
    ipc: host
    restart: always
  search-ui-server:
    image: ${REGISTRY:-opea}/searchqna-ui:${TAG:-latest}
    container_name: search-ui-server
    depends_on:
      - search-backend-server
    ports:
      - "${SEARCH_FRONTEND_SERVICE_PORT:-5173}:5173"
    environment:
      - no_proxy=${no_proxy}
      - https_proxy=${https_proxy}
      - http_proxy=${http_proxy}
      - BACKEND_BASE_URL=${SEARCH_BACKEND_SERVICE_ENDPOINT}
    ipc: host
    restart: always

networks:
  default:
    driver: bridge
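
One way to verify that every `${...}` reference in the compose file resolves after sourcing set_env.sh is to render the final configuration (`docker compose config` is standard Compose tooling; unset variables come out blank):

```bash
cd GenAIExamples/SearchQnA/docker_compose/amd/gpu/rocm
. set_env.sh
docker compose config | less
```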
Lines changed: 36 additions & 0 deletions
#!/usr/bin/env bash

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Host IPs: internal IP of the machine running the containers and the external
# IP used to reach the UI/backend (example values; replace with your own)
export SEARCH_HOST_IP=10.53.22.29
export SEARCH_EXTERNAL_HOST_IP=68.69.180.77
# Embedding and reranking models served by the two TEI instances
export SEARCH_EMBEDDING_MODEL_ID='BAAI/bge-base-en-v1.5'
export SEARCH_TEI_EMBEDDING_ENDPOINT=http://${SEARCH_HOST_IP}:3001
export SEARCH_RERANK_MODEL_ID='BAAI/bge-reranker-base'
export SEARCH_TEI_RERANKING_ENDPOINT=http://${SEARCH_HOST_IP}:3004
# Credentials are taken from the current shell environment
export SEARCH_HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export SEARCH_OPENAI_API_KEY=${OPENAI_API_KEY}

# TGI endpoint and the LLM served on the AMD GPU
export SEARCH_TGI_LLM_ENDPOINT=http://${SEARCH_HOST_IP}:3006
export SEARCH_LLM_MODEL_ID='Intel/neural-chat-7b-v3-3'

# Host IPs used by the MegaService to reach each microservice
export SEARCH_MEGA_SERVICE_HOST_IP=${SEARCH_EXTERNAL_HOST_IP}
export SEARCH_EMBEDDING_SERVICE_HOST_IP=${SEARCH_HOST_IP}
export SEARCH_WEB_RETRIEVER_SERVICE_HOST_IP=${SEARCH_HOST_IP}
export SEARCH_RERANK_SERVICE_HOST_IP=${SEARCH_HOST_IP}
export SEARCH_LLM_SERVICE_HOST_IP=${SEARCH_HOST_IP}

# Host ports published by the microservices in the compose file
export SEARCH_EMBEDDING_SERVICE_PORT=3002
export SEARCH_WEB_RETRIEVER_SERVICE_PORT=3003
export SEARCH_RERANK_SERVICE_PORT=3005
export SEARCH_LLM_SERVICE_PORT=3007

# Ports and endpoint for the UI and the backend MegaService
export SEARCH_FRONTEND_SERVICE_PORT=18143
export SEARCH_BACKEND_SERVICE_PORT=18142
export SEARCH_BACKEND_SERVICE_ENDPOINT=http://${SEARCH_EXTERNAL_HOST_IP}:${SEARCH_BACKEND_SERVICE_PORT}/v1/searchqna

# Google Programmable Search credentials for the web retriever, taken from the environment
export SEARCH_GOOGLE_API_KEY=${GOOGLE_API_KEY}
export SEARCH_GOOGLE_CSE_ID=${GOOGLE_CSE_ID}
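
The script expects the upstream credentials to be exported in the shell beforehand; a minimal example with placeholder values (not real keys):

```bash
# Placeholder credentials, substitute your own before sourcing set_env.sh
export HUGGINGFACEHUB_API_TOKEN=hf_xxx
export GOOGLE_API_KEY=your_google_api_key
export GOOGLE_CSE_ID=your_google_cse_id
. set_env.sh
```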
