
Commit d56e1f2

Combine all ChatQnA related docker images into one
Remove Dockerfile.faqgen, Dockerfile.without_rerank, and Dockerfile.guardrails. Combine all variants into a single Dockerfile and use the environment variable CHATQNA_TYPE to select among them.

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
1 parent 4a849c6 commit d56e1f2

26 files changed: +62 −161 lines

ChatQnA/Dockerfile

Lines changed: 2 additions & 1 deletion

@@ -5,5 +5,6 @@ ARG BASE_TAG=latest
 FROM opea/comps-base:$BASE_TAG
 
 COPY ./chatqna.py $HOME/chatqna.py
+COPY ./entrypoint.sh $HOME/entrypoint.sh
 
-ENTRYPOINT ["python", "chatqna.py"]
+ENTRYPOINT ["bash", "entrypoint.sh"]

ChatQnA/Dockerfile.faqgen

Lines changed: 0 additions & 49 deletions
This file was deleted.

ChatQnA/Dockerfile.guardrails

Lines changed: 0 additions & 9 deletions
This file was deleted.

ChatQnA/Dockerfile.without_rerank

Lines changed: 0 additions & 9 deletions
This file was deleted.

ChatQnA/docker_compose/amd/gpu/rocm/README.md

Lines changed: 8 additions & 19 deletions

@@ -115,25 +115,14 @@ docker build -t opea/llm-faqgen:latest --build-arg https_proxy=$https_proxy --bu
 
 ### 5. Build MegaService Docker Image
 
-1. MegaService with text generation
-   To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build the MegaService Docker image using the command below:
+To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build the MegaService Docker image using the command below:
 
-```bash
-git clone https://github.com/opea-project/GenAIExamples.git
-cd GenAIExamples/ChatQnA/docker
-docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
-cd ../../..
-```
-
-2. MegaService with FAQ generation
-
-   To use FAQ generation instead of normal text generation LLM, please use the below command:
-
-```bash
-git clone https://github.com/opea-project/GenAIExamples.git
-cd GenAIExamples/ChatQnA
-docker build --no-cache -t opea/chatqna-faqgen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.faqgen .
-```
+```bash
+git clone https://github.com/opea-project/GenAIExamples.git
+cd GenAIExamples/ChatQnA/docker
+docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+cd ../../..
+```
 
 ### 6. Build UI Docker Image

@@ -166,7 +155,7 @@ Then run the command `docker images`, you will have the following 5 Docker Image
 
 1. `opea/retriever:latest`
 2. `opea/dataprep:latest`
-3. `opea/chatqna:latest` or `opea/chatqna-faqgen:latest`
+3. `opea/chatqna:latest`
 4. `opea/chatqna-ui:latest` or `opea/chatqna-react-ui:latest`
 5. `opea/nginx:latest`
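Because every pipeline variant now ships in the single `opea/chatqna` image, the variant is selected at run time instead of build time. A sketch of the idea, using the selector value this commit wires into the compose files (in practice the compose files below set it for you):

```bash
# Run the consolidated image as the FAQ-generation backend; the supporting
# services (retriever, dataprep, LLM server, ...) must already be reachable
# for the pipeline to actually serve requests.
docker run -e CHATQNA_TYPE=CHATQNA_FAQGEN ${REGISTRY:-opea}/chatqna:${TAG:-latest}
```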

ChatQnA/docker_compose/amd/gpu/rocm/compose_faqgen.yaml

Lines changed: 2 additions & 1 deletion

@@ -136,7 +136,7 @@ services:
       LOGFLAG: ${LOGFLAG:-False}
     restart: unless-stopped
   chatqna-backend-server:
-    image: ${REGISTRY:-opea}/chatqna-faqgen:${TAG:-latest}
+    image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
     container_name: chatqna-backend-server
     depends_on:
       - chatqna-redis-vector-db
@@ -160,6 +160,7 @@ services:
       - LLM_SERVER_HOST_IP=${HOST_IP}
       - LLM_SERVER_PORT=${CHATQNA_LLM_FAQGEN_PORT:-9000}
      - LLM_MODEL=${CHATQNA_LLM_MODEL_ID}
+      - CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_FAQGEN}
     ipc: host
     restart: always
   chatqna-ui-server:
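With the default in place, bringing the FAQ-generation stack up needs no extra configuration; a usage sketch:

```bash
# CHATQNA_TYPE falls back to CHATQNA_FAQGEN inside compose_faqgen.yaml,
# so a plain `up` selects the FAQ-generation pipeline.
docker compose -f compose_faqgen.yaml up -d
```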

ChatQnA/docker_compose/intel/cpu/xeon/README.md

Lines changed: 7 additions & 29 deletions

@@ -153,35 +153,13 @@ docker build -t opea/llm-faqgen:latest --build-arg https_proxy=$https_proxy --bu
 
 ### 4. Build MegaService Docker Image
 
-1. MegaService with Rerank
+To construct the Mega Service with Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build MegaService Docker image via below command:
 
-   To construct the Mega Service with Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build MegaService Docker image via below command:
-
-```bash
-git clone https://github.com/opea-project/GenAIExamples.git
-cd GenAIExamples/ChatQnA
-docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
-```
-
-2. MegaService without Rerank
-
-   To construct the Mega Service without Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna_without_rerank.py` Python script. Build MegaService Docker image via below command:
-
-```bash
-git clone https://github.com/opea-project/GenAIExamples.git
-cd GenAIExamples/ChatQnA
-docker build --no-cache -t opea/chatqna-without-rerank:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.without_rerank .
-```
-
-3. MegaService with FaqGen
-
-   To use FAQ generation instead of normal text generation LLM, please use the below command:
-
-```bash
-git clone https://github.com/opea-project/GenAIExamples.git
-cd GenAIExamples/ChatQnA
-docker build --no-cache -t opea/chatqna-faqgen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile.faqgen .
-```
+```bash
+git clone https://github.com/opea-project/GenAIExamples.git
+cd GenAIExamples/ChatQnA
+docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+```
 
 ### 5. Build UI Docker Image

@@ -214,7 +192,7 @@ Then run the command `docker images`, you will have the following 5 Docker Image
 
 1. `opea/dataprep:latest`
 2. `opea/retriever:latest`
-3. `opea/chatqna:latest` or `opea/chatqna-without-rerank:latest` or `opea/chatqna-faqgen:latest`
+3. `opea/chatqna:latest`
 4. `opea/chatqna-ui:latest`
 5. `opea/nginx:latest`
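After the builds, a quick check that the consolidation worked: the per-variant backend images are gone and only `opea/chatqna` remains.

```bash
# List the ChatQnA images; expect a single backend image (opea/chatqna),
# with no chatqna-without-rerank or chatqna-faqgen entries.
docker images | grep -E 'opea/(chatqna|chatqna-ui|dataprep|retriever|nginx)'
```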

ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen.yaml

Lines changed: 2 additions & 1 deletion

@@ -118,7 +118,7 @@ services:
       LOGFLAG: ${LOGFLAG:-False}
     restart: unless-stopped
   chatqna-xeon-backend-server:
-    image: ${REGISTRY:-opea}/chatqna-faqgen:${TAG:-latest}
+    image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
     container_name: chatqna-xeon-backend-server
     depends_on:
       - redis-vector-db
@@ -144,6 +144,7 @@ services:
       - LLM_SERVER_PORT=9000
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
+      - CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_FAQGEN}
     ipc: host
     restart: always
   chatqna-xeon-ui-server:
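Compose resolves `${CHATQNA_TYPE:-CHATQNA_FAQGEN}` on the host and passes the result into the container, where the new `entrypoint.sh` dispatches on it. A quick way to verify what the backend received (container name taken from this compose file):

```bash
# Inspect the resolved selector inside the running backend container.
docker exec chatqna-xeon-backend-server printenv CHATQNA_TYPE
# expected output: CHATQNA_FAQGEN
```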

ChatQnA/docker_compose/intel/cpu/xeon/compose_faqgen_tgi.yaml

Lines changed: 2 additions & 1 deletion

@@ -118,7 +118,7 @@ services:
       LOGFLAG: ${LOGFLAG:-False}
     restart: unless-stopped
   chatqna-xeon-backend-server:
-    image: ${REGISTRY:-opea}/chatqna-faqgen:${TAG:-latest}
+    image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
     container_name: chatqna-xeon-backend-server
     depends_on:
       - redis-vector-db
@@ -144,6 +144,7 @@ services:
       - LLM_SERVER_PORT=9000
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
+      - CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_FAQGEN}
     ipc: host
     restart: always
   chatqna-xeon-ui-server:

ChatQnA/docker_compose/intel/cpu/xeon/compose_without_rerank.yaml

Lines changed: 2 additions & 1 deletion

@@ -75,7 +75,7 @@ services:
       VLLM_TORCH_PROFILER_DIR: "/mnt"
     command: --model $LLM_MODEL_ID --host 0.0.0.0 --port 80
   chatqna-xeon-backend-server:
-    image: ${REGISTRY:-opea}/chatqna-without-rerank:${TAG:-latest}
+    image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
     container_name: chatqna-xeon-backend-server
     depends_on:
       - redis-vector-db
@@ -97,6 +97,7 @@ services:
       - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
       - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
+      - CHATQNA_TYPE=${CHATQNA_TYPE:-CHATQNA_NO_RERANK}
     ipc: host
     restart: always
  chatqna-xeon-ui-server:
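The `${CHATQNA_TYPE:-CHATQNA_NO_RERANK}` form is ordinary default-value parameter expansion, behaving the same in Compose interpolation as in the shell; a short illustration:

```bash
unset CHATQNA_TYPE
echo "${CHATQNA_TYPE:-CHATQNA_NO_RERANK}"   # -> CHATQNA_NO_RERANK (default used)
export CHATQNA_TYPE=CHATQNA_FAQGEN
echo "${CHATQNA_TYPE:-CHATQNA_NO_RERANK}"   # -> CHATQNA_FAQGEN (set value wins)
```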

ChatQnA/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 7 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -176,7 +176,7 @@ This deployment may allocate more Gaudi resources to the tgi-service to optimize
176176

177177
### compose_faqgen.yaml - FAQ generation Deployment
178178

179-
The FAQs(frequently asked questions and answers) generation Deployment will generate FAQs instead of normally text generation. It add a new microservice called `llm-faqgen`, which is a microservice that interacts with the TGI/vLLM LLM server to generate FAQs from input text. Chatqna backend image change from `opea/chatqna:latest` to `opea/chatqna-faqgen:latest`, which depends on `llm-faqgen`.
179+
The FAQs(frequently asked questions and answers) generation Deployment will generate FAQs instead of normally text generation. It add a new microservice called `llm-faqgen`, which is a microservice that interacts with the TGI/vLLM LLM server to generate FAQs from input text.
180180

181181
The TGI (Text Generation Inference) deployment and the default deployment differ primarily in their service configurations and specific focus on handling large language models (LLMs). The TGI deployment includes a unique `tgi-service`, which utilizes the `ghcr.io/huggingface/tgi-gaudi:2.0.6` image and is specifically configured to run on Gaudi hardware. This service is designed to handle LLM tasks with optimizations such as `ENABLE_HPU_GRAPH` and `USE_FLASH_ATTENTION`. The `chatqna-gaudi-backend-server` in the TGI deployment depends on the `tgi-service`, whereas in the default deployment, it relies on the `vllm-service`.
182182

@@ -188,16 +188,16 @@ The TGI (Text Generation Inference) deployment and the default deployment differ
188188
| retriever | opea/retriever:latest | No |
189189
| tei-reranking-service | ghcr.io/huggingface/tei-gaudi:1.5.0 | 1 card |
190190
| vllm-service | opea/vllm-gaudi:latest | Configurable |
191-
| llm-faqgen | opea/llm-faqgen:latest | No |
192-
| chatqna-gaudi-backend-server | opea/chatqna-faqgen:latest | No |
191+
| **llm-faqgen** | **opea/llm-faqgen:latest** | No |
192+
| chatqna-gaudi-backend-server | opea/chatqna:latest | No |
193193
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest | No |
194194
| chatqna-gaudi-nginx-server | opea/nginx:latest | No |
195195

196196
We also provided a TGI based deployment for FAQ generation `compose_faqgen_tgi.yaml`, which only replace `vllm-service` with `tgi-service`.
197197

198198
### compose_without_rerank.yaml - No ReRank Deployment
199199

200-
The _compose_without_rerank.yaml_ Docker Compose file is distinct from the default deployment primarily due to the exclusion of the reranking service. In this version, the `tei-reranking-service`, which is typically responsible for providing reranking capabilities for text embeddings and is configured to run on Gaudi hardware, is absent. This omission simplifies the service architecture by removing a layer of processing that would otherwise enhance the ranking of text embeddings. Consequently, the `chatqna-gaudi-backend-server` in this deployment uses a specialized image, `opea/chatqna-without-rerank:latest`, indicating that it is tailored to function without the reranking feature. As a result, the backend server's dependencies are adjusted, without the need for the reranking service. This streamlined setup may impact the application's functionality and performance by focusing on core operations without the additional processing layer provided by reranking, potentially making it more efficient for scenarios where reranking is not essential and freeing Intel® Gaudi® accelerators for other tasks.
200+
The _compose_without_rerank.yaml_ Docker Compose file is distinct from the default deployment primarily due to the exclusion of the reranking service. In this version, the `tei-reranking-service`, which is typically responsible for providing reranking capabilities for text embeddings and is configured to run on Gaudi hardware, is absent. This omission simplifies the service architecture by removing a layer of processing that would otherwise enhance the ranking of text embeddings. As a result, the backend server's dependencies are adjusted, without the need for the reranking service. This streamlined setup may impact the application's functionality and performance by focusing on core operations without the additional processing layer provided by reranking, potentially making it more efficient for scenarios where reranking is not essential and freeing Intel® Gaudi® accelerators for other tasks.
201201

202202
| Service Name | Image Name | Gaudi Specific |
203203
| ---------------------------- | ----------------------------------------------------- | -------------- |
@@ -206,15 +206,15 @@ The _compose_without_rerank.yaml_ Docker Compose file is distinct from the defau
206206
| tei-embedding-service | ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 | No |
207207
| retriever | opea/retriever:latest | No |
208208
| vllm-service | opea/vllm-gaudi:latest | Configurable |
209-
| chatqna-gaudi-backend-server | **opea/chatqna-without-rerank:latest** | No |
209+
| chatqna-gaudi-backend-server | opea/chatqna:latest | No |
210210
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest | No |
211211
| chatqna-gaudi-nginx-server | opea/nginx:latest | No |
212212

213213
This setup might allow for more Gaudi devices to be dedicated to the `vllm-service`, enhancing LLM processing capabilities and accommodating larger models. However, it also means that the benefits of reranking are sacrificed, which could impact the overall quality of the pipeline's output.
214214

215215
### compose_guardrails.yaml - Guardrails Deployment
216216

217-
The _compose_guardrails.yaml_ Docker Compose file introduces enhancements over the default deployment by incorporating additional services focused on safety and ChatQnA response control. Notably, it includes the `tgi-guardrails-service` and `guardrails` services. The `tgi-guardrails-service` uses the `ghcr.io/huggingface/tgi-gaudi:2.0.6` image and is configured to run on Gaudi hardware, providing functionality to manage input constraints and ensure safe operations within defined limits. The guardrails service, using the `opea/guardrails:latest` image, acts as a safety layer that interfaces with the `tgi-guardrails-service` to enforce safety protocols and manage interactions with the large language model (LLM). Additionally, the `chatqna-gaudi-backend-server` is updated to use the `opea/chatqna-guardrails:latest` image, indicating its design to integrate with these new guardrail services. This backend server now depends on the `tgi-guardrails-service` and `guardrails`, alongside existing dependencies like `redis-vector-db`, `tei-embedding-service`, `retriever`, `tei-reranking-service`, and `vllm-service`. The environment configurations for the backend are also updated to include settings for the guardrail services.
217+
The _compose_guardrails.yaml_ Docker Compose file introduces enhancements over the default deployment by incorporating additional services focused on safety and ChatQnA response control. Notably, it includes the `tgi-guardrails-service` and `guardrails` services. The `tgi-guardrails-service` uses the `ghcr.io/huggingface/tgi-gaudi:2.0.6` image and is configured to run on Gaudi hardware, providing functionality to manage input constraints and ensure safe operations within defined limits. The guardrails service, using the `opea/guardrails:latest` image, acts as a safety layer that interfaces with the `tgi-guardrails-service` to enforce safety protocols and manage interactions with the large language model (LLM). This backend server now depends on the `tgi-guardrails-service` and `guardrails`, alongside existing dependencies like `redis-vector-db`, `tei-embedding-service`, `retriever`, `tei-reranking-service`, and `vllm-service`. The environment configurations for the backend are also updated to include settings for the guardrail services.
218218

219219
| Service Name | Image Name | Gaudi Specific | Uses LLM |
220220
| ---------------------------- | ----------------------------------------------------- | -------------- | -------- |
@@ -226,7 +226,7 @@ The _compose_guardrails.yaml_ Docker Compose file introduces enhancements over t
226226
| retriever | opea/retriever:latest | No | No |
227227
| tei-reranking-service | ghcr.io/huggingface/tei-gaudi:1.5.0 | 1 card | No |
228228
| vllm-service | opea/vllm-gaudi:latest | Configurable | Yes |
229-
| chatqna-gaudi-backend-server | opea/chatqna-guardrails:latest | No | No |
229+
| chatqna-gaudi-backend-server | opea/chatqna:latest | No | No |
230230
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest | No | No |
231231
| chatqna-gaudi-nginx-server | opea/nginx:latest | No | No |
232232

@@ -266,8 +266,6 @@ The table provides a comprehensive overview of the ChatQnA services utilized acr
266266
| tgi-guardrails-service | ghcr.io/huggingface/tgi-gaudi:2.0.6 | Yes | Provides guardrails functionality, ensuring safe operations within defined limits. |
267267
| guardrails | opea/guardrails:latest | Yes | Acts as a safety layer, interfacing with the `tgi-guardrails-service` to enforce safety protocols. |
268268
| chatqna-gaudi-backend-server | opea/chatqna:latest | No | Serves as the backend for the ChatQnA application, with variations depending on the deployment. |
269-
| | opea/chatqna-without-rerank:latest | | |
270-
| | opea/chatqna-guardrails:latest | | |
271269
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest | No | Provides the user interface for the ChatQnA application. |
272270
| chatqna-gaudi-nginx-server | opea/nginx:latest | No | Acts as a reverse proxy, managing traffic between the UI and backend services. |
273271
| jaeger | jaegertracing/all-in-one:latest | Yes | Provides tracing and monitoring capabilities for distributed systems. |
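Taken together, every Gaudi deployment variant now runs the same `opea/chatqna:latest` backend and differs only in its compose file and the `CHATQNA_TYPE` default that file sets; a sketch of switching variants (paths per this repository layout):

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi
# Same backend image either way; the compose file picks the pipeline.
docker compose -f compose_without_rerank.yaml up -d   # no-rerank variant
docker compose -f compose_faqgen.yaml up -d           # FAQ-generation variant
```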
