Commit c50dfb2

Adding files to deploy ChatQnA application on ROCm vLLM (#1560)
Signed-off-by: Chingis Yundunov <YundunovCN@sibedge.com>
1 parent 4ce847c commit c50dfb2

19 files changed, 1679 additions and 605 deletions

ChatQnA/README.md

Lines changed: 14 additions & 12 deletions
@@ -23,14 +23,15 @@ RAG bridges the knowledge gap by dynamically fetching relevant information from
 | Azure | 5th Gen Intel Xeon with Intel AMX | Work-in-progress | Work-in-progress |
 | Intel Tiber AI Cloud | 5th Gen Intel Xeon with Intel AMX | Work-in-progress | Work-in-progress |

-## Automated Deployment to Ubuntu based system(if not using Terraform) using Intel® Optimized Cloud Modules for **Ansible**
+## Automated Deployment to Ubuntu based system (if not using Terraform) using Intel® Optimized Cloud Modules for **Ansible**

 To deploy to existing Xeon Ubuntu based system, use our Intel Optimized Cloud Modules for Ansible. This is the same Ansible playbook used by Terraform.
 Use this if you are not using Terraform and have provisioned your system with another tool or manually including bare metal.
-| Operating System | Intel Optimized Cloud Module for Ansible |
-|------------------|------------------------------------------|
-| Ubuntu 20.04 | [ChatQnA Ansible Module](https://github.com/intel/optimized-cloud-recipes/tree/main/recipes/ai-opea-chatqna-xeon) |
-| Ubuntu 22.04 | Work-in-progress |
+
+| Operating System | Intel Optimized Cloud Module for Ansible |
+| ---------------- | ----------------------------------------------------------------------------------------------------------------- |
+| Ubuntu 20.04 | [ChatQnA Ansible Module](https://github.com/intel/optimized-cloud-recipes/tree/main/recipes/ai-opea-chatqna-xeon) |
+| Ubuntu 22.04 | Work-in-progress |

 ## Manually Deploy ChatQnA Service

@@ -48,7 +49,7 @@ Note:

 1. If you do not have docker installed you can run this script to install docker : `bash docker_compose/install_docker.sh`.

-2. The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) or you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models).
+2. The default LLM is `meta-llama/Meta-Llama-3-8B-Instruct`. Before deploying the application, please make sure either you've requested and been granted the access to it on [Huggingface](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) `or` you've downloaded the model locally from [ModelScope](https://www.modelscope.cn/models).

 ### Quick Start: 1.Setup Environment Variable

@@ -221,13 +222,14 @@ This ChatQnA use case performs RAG using LangChain, Redis VectorDB and Text Gene
 In the below, we provide a table that describes for each microservice component in the ChatQnA architecture, the default configuration of the open source project, hardware, port, and endpoint.

 Gaudi default compose.yaml
-| MicroService | Open Source Project | HW | Port | Endpoint |
+
+| MicroService | Open Source Project | HW | Port | Endpoint |
 | ------------ | ------------------- | ----- | ---- | -------------------- |
-| Embedding | Langchain | Xeon | 6000 | /v1/embeddings |
-| Retriever | Langchain, Redis | Xeon | 7000 | /v1/retrieval |
-| Reranking | Langchain, TEI | Gaudi | 8000 | /v1/reranking |
-| LLM | Langchain, TGI | Gaudi | 9000 | /v1/chat/completions |
-| Dataprep | Redis, Langchain | Xeon | 6007 | /v1/dataprep/ingest |
+| Embedding | Langchain | Xeon | 6000 | /v1/embeddings |
+| Retriever | Langchain, Redis | Xeon | 7000 | /v1/retrieval |
+| Reranking | Langchain, TEI | Gaudi | 8000 | /v1/reranking |
+| LLM | Langchain, TGI | Gaudi | 9000 | /v1/chat/completions |
+| Dataprep | Redis, Langchain | Xeon | 6007 | /v1/dataprep/ingest |

 ### Required Models

ChatQnA/assets/img/ui-result-page.png

Binary file changed (sizes shown: 36.2 KB, 4.63 KB)
