Combine all ChatQnA-related Docker images into one

- Remove `Dockerfile.faqgen`, `Dockerfile.without_rerank`, and `Dockerfile.guardrails`.
- Combine all types into a single `Dockerfile`, using the env `CHATQNA_TYPE` to select the variant.

Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>
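The PR description names the `CHATQNA_TYPE` switch but not its values or where it is consumed. As a rough sketch of how the combined image could be driven (the variant names below are illustrative guesses, not taken from the PR):

```bash
# One image now covers all ChatQnA pipelines; the variant is chosen via the
# CHATQNA_TYPE environment variable named in the PR description.
# NOTE: the values below (CHATQNA, FAQGEN, WITHOUT_RERANK, GUARDRAILS) are
# illustrative guesses -- check the merged Dockerfile for the real ones.
docker run -e CHATQNA_TYPE=FAQGEN opea/chatqna:latest
```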
+ To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build the MegaService Docker image using the command below:
- To construct the Mega Service with Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build the MegaService Docker image using the command below:
- To construct the Mega Service without Rerank, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna_without_rerank.py` Python script. Build the MegaService Docker image using the command below:
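The build command those paragraphs refer to is not part of this hunk. A typical invocation, assuming the usual GenAIExamples checkout layout and the merged `Dockerfile` from this PR, might be:

```bash
# Clone the examples repo and build the single combined MegaService image.
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/ChatQnA
docker build --no-cache -t opea/chatqna:latest \
  --build-arg https_proxy=$https_proxy \
  --build-arg http_proxy=$http_proxy \
  -f Dockerfile .
```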
- The FAQ (frequently asked questions and answers) generation deployment will generate FAQs instead of normal text generation. It adds a new microservice called `llm-faqgen`, which interacts with the TGI/vLLM LLM server to generate FAQs from input text. The ChatQnA backend image changes from `opea/chatqna:latest` to `opea/chatqna-faqgen:latest`, which depends on `llm-faqgen`.
+ The FAQ (frequently asked questions and answers) generation deployment will generate FAQs instead of normal text generation. It adds a new microservice called `llm-faqgen`, which interacts with the TGI/vLLM LLM server to generate FAQs from input text.
The TGI (Text Generation Inference) deployment and the default deployment differ primarily in their service configurations and in how they serve the large language model (LLM). The TGI deployment includes a unique `tgi-service`, which utilizes the `ghcr.io/huggingface/tgi-gaudi:2.0.6` image and is specifically configured to run on Gaudi hardware. This service is designed to handle LLM tasks with optimizations such as `ENABLE_HPU_GRAPH` and `USE_FLASH_ATTENTION`. The `chatqna-gaudi-backend-server` in the TGI deployment depends on the `tgi-service`, whereas in the default deployment, it relies on the `vllm-service`.
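As a rough sketch of what that service amounts to outside of compose (only the image tag and the two optimization flags come from this page; the port mapping, model, and Habana runtime options are assumptions):

```bash
# Stand-alone launch of the Gaudi TGI container with the optimizations
# named above. Port, model, and runtime options are assumed, not confirmed.
docker run -d --name tgi-service \
  --runtime=habana --cap-add=sys_nice --ipc=host \
  -p 8005:80 \
  -e HABANA_VISIBLE_DEVICES=all \
  -e ENABLE_HPU_GRAPH=true \
  -e USE_FLASH_ATTENTION=true \
  ghcr.io/huggingface/tgi-gaudi:2.0.6 \
  --model-id meta-llama/Meta-Llama-3-8B-Instruct
```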
@@ -188,16 +188,16 @@ The TGI (Text Generation Inference) deployment and the default deployment differ
- | chatqna-gaudi-backend-server | opea/chatqna-faqgen:latest| No |
+ |**llm-faqgen**|**opea/llm-faqgen:latest**| No |
+ | chatqna-gaudi-backend-server | opea/chatqna:latest| No |
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest| No |
| chatqna-gaudi-nginx-server | opea/nginx:latest| No |
We also provide a TGI-based deployment for FAQ generation, `compose_faqgen_tgi.yaml`, which simply replaces the `vllm-service` with a `tgi-service`.
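Bringing up the FAQ-generation variant is then just a choice of compose file. A sketch, assuming the standard `docker_compose/intel/hpu/gaudi` layout used by GenAIExamples and a `compose_faqgen.yaml` counterpart to the TGI file named above:

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi
# Default FAQ generation (vLLM-backed); this file name is assumed:
docker compose -f compose_faqgen.yaml up -d
# TGI-backed FAQ generation, as named in this section:
docker compose -f compose_faqgen_tgi.yaml up -d
```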
### compose_without_rerank.yaml - No ReRank Deployment
- The _compose_without_rerank.yaml_ Docker Compose file is distinct from the default deployment primarily due to the exclusion of the reranking service. In this version, the `tei-reranking-service`, which is typically responsible for providing reranking capabilities for text embeddings and is configured to run on Gaudi hardware, is absent. This omission simplifies the service architecture by removing a layer of processing that would otherwise enhance the ranking of text embeddings. Consequently, the `chatqna-gaudi-backend-server` in this deployment uses a specialized image, `opea/chatqna-without-rerank:latest`, indicating that it is tailored to function without the reranking feature. As a result, the backend server's dependencies are adjusted, without the need for the reranking service. This streamlined setup may impact the application's functionality and performance by focusing on core operations without the additional processing layer provided by reranking, potentially making it more efficient for scenarios where reranking is not essential and freeing Intel® Gaudi® accelerators for other tasks.
+ The _compose_without_rerank.yaml_ Docker Compose file is distinct from the default deployment primarily due to the exclusion of the reranking service. In this version, the `tei-reranking-service`, which is typically responsible for providing reranking capabilities for text embeddings and is configured to run on Gaudi hardware, is absent. This omission simplifies the service architecture by removing a layer of processing that would otherwise enhance the ranking of text embeddings. As a result, the backend server's dependencies are adjusted, without the need for the reranking service. This streamlined setup may impact the application's functionality and performance by focusing on core operations without the additional processing layer provided by reranking, potentially making it more efficient for scenarios where reranking is not essential and freeing Intel® Gaudi® accelerators for other tasks.
- | chatqna-gaudi-backend-server |**opea/chatqna-without-rerank:latest**| No |
+ | chatqna-gaudi-backend-server | opea/chatqna:latest| No |
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest| No |
| chatqna-gaudi-nginx-server | opea/nginx:latest| No |
This setup might allow for more Gaudi devices to be dedicated to the `vllm-service`, enhancing LLM processing capabilities and accommodating larger models. However, it also means that the benefits of reranking are sacrificed, which could impact the overall quality of the pipeline's output.
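Because every variant now ships the same `opea/chatqna:latest` backend, a single smoke test works across deployments. A sketch, assuming the usual ChatQnA mega-service defaults (port 8888 and the `/v1/chatqna` route are not stated in this diff):

```bash
# Smoke-test the deployed pipeline through the mega-service endpoint.
curl http://${host_ip}:8888/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is the revenue of Nike in 2023?"}'
```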
- The _compose_guardrails.yaml_ Docker Compose file introduces enhancements over the default deployment by incorporating additional services focused on safety and ChatQnA response control. Notably, it includes the `tgi-guardrails-service` and `guardrails` services. The `tgi-guardrails-service` uses the `ghcr.io/huggingface/tgi-gaudi:2.0.6` image and is configured to run on Gaudi hardware, providing functionality to manage input constraints and ensure safe operations within defined limits. The guardrails service, using the `opea/guardrails:latest` image, acts as a safety layer that interfaces with the `tgi-guardrails-service` to enforce safety protocols and manage interactions with the large language model (LLM). Additionally, the `chatqna-gaudi-backend-server` is updated to use the `opea/chatqna-guardrails:latest` image, indicating its design to integrate with these new guardrail services. This backend server now depends on the `tgi-guardrails-service` and `guardrails`, alongside existing dependencies like `redis-vector-db`, `tei-embedding-service`, `retriever`, `tei-reranking-service`, and `vllm-service`. The environment configurations for the backend are also updated to include settings for the guardrail services.
+ The _compose_guardrails.yaml_ Docker Compose file introduces enhancements over the default deployment by incorporating additional services focused on safety and ChatQnA response control. Notably, it includes the `tgi-guardrails-service` and `guardrails` services. The `tgi-guardrails-service` uses the `ghcr.io/huggingface/tgi-gaudi:2.0.6` image and is configured to run on Gaudi hardware, providing functionality to manage input constraints and ensure safe operations within defined limits. The guardrails service, using the `opea/guardrails:latest` image, acts as a safety layer that interfaces with the `tgi-guardrails-service` to enforce safety protocols and manage interactions with the large language model (LLM). The `chatqna-gaudi-backend-server` now depends on the `tgi-guardrails-service` and `guardrails`, alongside existing dependencies like `redis-vector-db`, `tei-embedding-service`, `retriever`, `tei-reranking-service`, and `vllm-service`. The environment configurations for the backend are also updated to include settings for the guardrail services.
| Service Name | Image Name | Gaudi Specific | Uses LLM |
- | chatqna-gaudi-backend-server | opea/chatqna-guardrails:latest| No | No |
+ | chatqna-gaudi-backend-server | opea/chatqna:latest| No | No |
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest| No | No |
| chatqna-gaudi-nginx-server | opea/nginx:latest| No | No |
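A launch sketch for this variant; the exported variable below is a guess at the kind of guardrail setting the compose file reads, not a name confirmed by this diff:

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi
# Hypothetical safety-model setting consumed by the guardrail services:
export GUARDRAILS_MODEL_ID="meta-llama/Meta-Llama-Guard-2-8B"
docker compose -f compose_guardrails.yaml up -d
```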
@@ -266,8 +266,6 @@ The table provides a comprehensive overview of the ChatQnA services utilized acr
| tgi-guardrails-service | ghcr.io/huggingface/tgi-gaudi:2.0.6 | Yes | Provides guardrails functionality, ensuring safe operations within defined limits. |
| guardrails | opea/guardrails:latest| Yes | Acts as a safety layer, interfacing with the `tgi-guardrails-service` to enforce safety protocols. |
| chatqna-gaudi-backend-server | opea/chatqna:latest| No | Serves as the backend for the ChatQnA application, with variations depending on the deployment. |
- || opea/chatqna-without-rerank:latest|||
- || opea/chatqna-guardrails:latest|||
| chatqna-gaudi-ui-server | opea/chatqna-ui:latest| No | Provides the user interface for the ChatQnA application. |
| chatqna-gaudi-nginx-server | opea/nginx:latest| No | Acts as a reverse proxy, managing traffic between the UI and backend services. |
| jaeger | jaegertracing/all-in-one:latest| Yes | Provides tracing and monitoring capabilities for distributed systems. |
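Since `jaegertracing/all-in-one` is included for tracing, it can be verified quickly (16686 is Jaeger's default UI port; whether these compose files publish it is an assumption):

```bash
# Expect 200 if the Jaeger UI is reachable on its default port.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:16686
```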