
Commit 0eedbbf

lvliang-intel, pre-commit-ci[bot], and chensuyue authored
Update aipc ollama docker compose and readme (#984)
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
1 parent 9438d39 commit 0eedbbf

17 files changed: +67 −86 lines changed

ChatQnA/chatqna.py

Lines changed: 2 additions & 1 deletion
@@ -47,6 +47,7 @@ def generate_rag_prompt(question, documents):
 RERANK_SERVER_PORT = int(os.getenv("RERANK_SERVER_PORT", 80))
 LLM_SERVER_HOST_IP = os.getenv("LLM_SERVER_HOST_IP", "0.0.0.0")
 LLM_SERVER_PORT = int(os.getenv("LLM_SERVER_PORT", 80))
+LLM_MODEL = os.getenv("LLM_MODEL", "Intel/neural-chat-7b-v3-3")


 def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **kwargs):
@@ -61,7 +62,7 @@ def align_inputs(self, inputs, cur_node, runtime_graph, llm_parameters_dict, **k
     elif self.services[cur_node].service_type == ServiceType.LLM:
         # convert TGI/vLLM to unified OpenAI /v1/chat/completions format
         next_inputs = {}
-        next_inputs["model"] = "tgi"  # specifically clarify the fake model to make the format unified
+        next_inputs["model"] = LLM_MODEL
         next_inputs["messages"] = [{"role": "user", "content": inputs["inputs"]}]
         next_inputs["max_tokens"] = llm_parameters_dict["max_tokens"]
         next_inputs["top_p"] = llm_parameters_dict["top_p"]

ChatQnA/docker_compose/intel/cpu/aipc/README.md

Lines changed: 30 additions & 45 deletions
@@ -78,26 +78,27 @@ llama3.2:latest a80c4f17acd5 2.0 GB 2 minutes ago
 Access ollama service to verify that the ollama is functioning correctly.

 ```bash
-curl http://${host_ip}:11434/api/generate -d '{"model": "llama3.2", "prompt":"What is Deep Learning?"}'
+curl http://${host_ip}:11434/v1/chat/completions \
+    -H "Content-Type: application/json" \
+    -d '{
+        "model": "llama3.2",
+        "messages": [
+            {
+                "role": "system",
+                "content": "You are a helpful assistant."
+            },
+            {
+                "role": "user",
+                "content": "Hello!"
+            }
+        ]
+    }'
 ```

 The outputs are similar to these:

 ```
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.098813868Z","response":"Deep","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.124514468Z","response":" learning","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.149754216Z","response":" is","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.180420784Z","response":" a","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.229185873Z","response":" subset","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.263956118Z","response":" of","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.289097354Z","response":" machine","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.316838918Z","response":" learning","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.342309506Z","response":" that","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.367221264Z","response":" involves","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.39205893Z","response":" the","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.417933974Z","response":" use","done":false}
-{"model":"llama3.2","created_at":"2024-10-12T12:55:28.443110388Z","response":" of","done":false}
-...
+{"id":"chatcmpl-4","object":"chat.completion","created":1729232496,"model":"llama3.2","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"How can I assist you today? Are you looking for information, answers to a question, or just need someone to chat with? I'm here to help in any way I can."},"finish_reason":"stop"}],"usage":{"prompt_tokens":33,"completion_tokens":38,"total_tokens":71}}
 ```

 ## 🚀 Build Docker Images
@@ -122,20 +123,14 @@ export https_proxy="Your_HTTPs_Proxy"
 docker build --no-cache -t opea/retriever-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/redis/langchain/Dockerfile .
 ```

-### 2 Build LLM Image
-
-```bash
-docker build --no-cache -t opea/llm-ollama:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/ollama/langchain/Dockerfile .
-```
-
-### 3. Build Dataprep Image
+### 2. Build Dataprep Image

 ```bash
 docker build --no-cache -t opea/dataprep-redis:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/redis/langchain/Dockerfile .
 cd ..
 ```

-### 4. Build MegaService Docker Image
+### 3. Build MegaService Docker Image

 To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script. Build MegaService Docker image via below command:

@@ -146,7 +141,7 @@ cd GenAIExamples/ChatQnA
 docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
 ```

-### 5. Build UI Docker Image
+### 4. Build UI Docker Image

 Build frontend Docker image via below command:

@@ -155,7 +150,7 @@ cd ~/OPEA/GenAIExamples/ChatQnA/ui
 docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
 ```

-### 6. Build Nginx Docker Image
+### 5. Build Nginx Docker Image

 ```bash
 cd GenAIComps
@@ -166,10 +161,9 @@ Then run the command `docker images`, you will have the following 6 Docker Image

 1. `opea/dataprep-redis:latest`
 2. `opea/retriever-redis:latest`
-3. `opea/llm-ollama:latest`
-4. `opea/chatqna:latest`
-5. `opea/chatqna-ui:latest`
-6. `opea/nginx:latest`
+3. `opea/chatqna:latest`
+4. `opea/chatqna-ui:latest`
+5. `opea/nginx:latest`

 ## 🚀 Start Microservices

@@ -195,10 +189,10 @@ For Linux users, please run `hostname -I | awk '{print $1}'`. For Windows users,
 export your_hf_api_token="Your_Huggingface_API_Token"
 ```

-**Append the value of the public IP address to the no_proxy list**
+**Append the value of the public IP address to the no_proxy list if you are in a proxy environment**

 ```
-export your_no_proxy=${your_no_proxy},"External_Public_IP"
+export your_no_proxy=${your_no_proxy},"External_Public_IP",chatqna-aipc-backend-server,tei-embedding-service,retriever,tei-reranking-service,redis-vector-db,dataprep-redis-service
 ```

 - Linux PC
@@ -211,7 +205,7 @@ export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
 export RERANK_MODEL_ID="BAAI/bge-reranker-base"
 export INDEX_NAME="rag-redis"
 export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
-export OLLAMA_ENDPOINT=http://${host_ip}:11434
+export OLLAMA_HOST=${host_ip}
 export OLLAMA_MODEL="llama3.2"
 ```

@@ -222,7 +216,7 @@ set EMBEDDING_MODEL_ID=BAAI/bge-base-en-v1.5
 set RERANK_MODEL_ID=BAAI/bge-reranker-base
 set INDEX_NAME=rag-redis
 set HUGGINGFACEHUB_API_TOKEN=%your_hf_api_token%
-set OLLAMA_ENDPOINT=http://host.docker.internal:11434
+set OLLAMA_HOST=host.docker.internal
 set OLLAMA_MODEL="llama3.2"
 ```

@@ -277,24 +271,15 @@ For details on how to verify the correctness of the response, refer to [how-to-v
 curl http://${host_ip}:11434/api/generate -d '{"model": "llama3.2", "prompt":"What is Deep Learning?"}'
 ```

-5. LLM Microservice
-
-```bash
-curl http://${host_ip}:9000/v1/chat/completions\
--X POST \
--d '{"query":"What is Deep Learning?","max_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
--H 'Content-Type: application/json'
-```
-
-6. MegaService
+5. MegaService

 ```bash
 curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{
 "messages": "What is the revenue of Nike in 2023?"
 }'
 ```

-7. Upload RAG Files through Dataprep Microservice (Optional)
+6. Upload RAG Files through Dataprep Microservice (Optional)

 To chat with retrieved information, you need to upload a file using Dataprep service.
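The upload command itself is outside this hunk. For reference, a typical Dataprep upload in this deployment looks like the sketch below; port 6007 comes from `DATAPREP_SERVICE_PORT` in the compose file, while the `/v1/dataprep` path and the file name are assumptions based on the OPEA dataprep-redis service and may differ in your checkout:

```bash
# Sketch: upload a local document to the dataprep-redis-service (port 6007 per compose.yaml).
# The /v1/dataprep path and the sample file name are assumptions; adjust to match the
# Dataprep section of the README you are following.
curl -X POST "http://${host_ip}:6007/v1/dataprep" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./your_document.pdf"
```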

@@ -334,4 +319,4 @@ the output is:

 ## 🚀 Launch the UI

-To access the frontend, open the following URL in your browser: http://{host_ip}:5173.
+To access the frontend, open the following URL in your browser: http://{host_ip}:80.
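Since the frontend is now reached through nginx on port 80, an end-to-end check of the proxy route can be useful before opening the browser. The sketch below assumes the standard OPEA nginx template, which maps `BACKEND_SERVICE_NAME=chatqna` to a `/v1/chatqna` route on `${NGINX_PORT:-80}`; the question is only an example:

```bash
# Sketch: verify the nginx server (port 80 by default) forwards requests to the ChatQnA backend.
# Assumes the default OPEA nginx route /v1/chatqna derived from BACKEND_SERVICE_NAME=chatqna.
curl http://${host_ip}:${NGINX_PORT:-80}/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is Deep Learning?"}'
```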

ChatQnA/docker_compose/intel/cpu/aipc/compose.yaml

Lines changed: 13 additions & 28 deletions
@@ -72,22 +72,7 @@ services:
       HF_HUB_DISABLE_PROGRESS_BARS: 1
       HF_HUB_ENABLE_HF_TRANSFER: 0
     command: --model-id ${RERANK_MODEL_ID} --auto-truncate
-  llm:
-    image: ${REGISTRY:-opea}/llm-ollama
-    container_name: llm-ollama
-    ports:
-      - "9000:9000"
-    ipc: host
-    environment:
-      no_proxy: ${no_proxy}
-      http_proxy: ${http_proxy}
-      https_proxy: ${https_proxy}
-      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
-      HF_HUB_DISABLE_PROGRESS_BARS: 1
-      HF_HUB_ENABLE_HF_TRANSFER: 0
-      OLLAMA_ENDPOINT: ${OLLAMA_ENDPOINT}
-      OLLAMA_MODEL: ${OLLAMA_MODEL}
-  chaqna-aipc-backend-server:
+  chatqna-aipc-backend-server:
     image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
     container_name: chatqna-aipc-backend-server
     depends_on:
@@ -96,29 +81,29 @@ services:
       - tei-embedding-service
       - retriever
       - tei-reranking-service
-      - llm
     ports:
       - "8888:8888"
     environment:
       - no_proxy=${no_proxy}
       - https_proxy=${https_proxy}
       - http_proxy=${http_proxy}
-      - MEGA_SERVICE_HOST_IP=chaqna-aipc-backend-server
+      - MEGA_SERVICE_HOST_IP=chatqna-aipc-backend-server
       - EMBEDDING_SERVER_HOST_IP=tei-embedding-service
       - EMBEDDING_SERVER_PORT=80
       - RETRIEVER_SERVICE_HOST_IP=retriever
       - RERANK_SERVER_HOST_IP=tei-reranking-service
       - RERANK_SERVER_PORT=80
-      - LLM_SERVER_HOST_IP=llm
-      - LLM_SERVER_PORT=9000
+      - LLM_SERVER_HOST_IP=${OLLAMA_HOST}
+      - LLM_SERVER_PORT=11434
+      - LLM_MODEL=${OLLAMA_MODEL}
       - LOGFLAG=${LOGFLAG}
     ipc: host
     restart: always
-  chaqna-aipc-ui-server:
+  chatqna-aipc-ui-server:
     image: ${REGISTRY:-opea}/chatqna-ui:${TAG:-latest}
     container_name: chatqna-aipc-ui-server
     depends_on:
-      - chaqna-aipc-backend-server
+      - chatqna-aipc-backend-server
     ports:
       - "5173:5173"
     environment:
@@ -127,22 +112,22 @@ services:
       - http_proxy=${http_proxy}
     ipc: host
     restart: always
-  chaqna-aipc-nginx-server:
+  chatqna-aipc-nginx-server:
     image: ${REGISTRY:-opea}/nginx:${TAG:-latest}
-    container_name: chaqna-aipc-nginx-server
+    container_name: chatqna-aipc-nginx-server
     depends_on:
-      - chaqna-aipc-backend-server
-      - chaqna-aipc-ui-server
+      - chatqna-aipc-backend-server
+      - chatqna-aipc-ui-server
     ports:
      - "${NGINX_PORT:-80}:80"
     environment:
       - no_proxy=${no_proxy}
       - https_proxy=${https_proxy}
       - http_proxy=${http_proxy}
-      - FRONTEND_SERVICE_IP=chatqna-xeon-ui-server
+      - FRONTEND_SERVICE_IP=chatqna-aipc-ui-server
       - FRONTEND_SERVICE_PORT=5173
       - BACKEND_SERVICE_NAME=chatqna
-      - BACKEND_SERVICE_IP=chatqna-xeon-backend-server
+      - BACKEND_SERVICE_IP=chatqna-aipc-backend-server
      - BACKEND_SERVICE_PORT=8888
      - DATAPREP_SERVICE_IP=dataprep-redis-service
      - DATAPREP_SERVICE_PORT=6007
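With the dedicated llm-ollama wrapper removed, the backend now talks directly to the host's Ollama daemon at `${OLLAMA_HOST}:11434`. Before `docker compose up`, it is worth confirming that address is reachable and the chosen model has been pulled; a minimal sketch, assuming Ollama's standard model-listing endpoint:

```bash
# Sketch: confirm the host Ollama daemon the backend will use is reachable
# and that the configured model (e.g. llama3.2) appears in the local model list.
export OLLAMA_HOST=${OLLAMA_HOST:-$host_ip}   # matches set_env.sh in this commit
curl http://${OLLAMA_HOST}:11434/api/tags     # /api/tags lists locally pulled models
```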

ChatQnA/docker_compose/intel/cpu/aipc/set_env.sh

Lines changed: 1 addition & 1 deletion
@@ -16,5 +16,5 @@ export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
 export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
 export RERANK_MODEL_ID="BAAI/bge-reranker-base"
 export INDEX_NAME="rag-redis"
-export OLLAMA_ENDPOINT=http://${host_ip}:11434
+export OLLAMA_HOST=${host_ip}
 export OLLAMA_MODEL="llama3.2"

ChatQnA/docker_compose/intel/cpu/xeon/README.md

Lines changed: 5 additions & 4 deletions
@@ -17,8 +17,6 @@ To set up environment variables for deploying ChatQnA services, follow these ste
 ```bash
 # Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
-# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
-export no_proxy="Your_No_Proxy"
 export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

@@ -27,6 +25,9 @@ To set up environment variables for deploying ChatQnA services, follow these ste
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+export no_proxy="Your_No_Proxy",chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm_service
 ```

 3. Set up other environment variables:
@@ -218,8 +219,6 @@ For users in China who are unable to download models directly from Huggingface,
 ```bash
 # Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
-# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
-export no_proxy="Your_No_Proxy"
 export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 # Example: NGINX_PORT=80
 export NGINX_PORT=${your_nginx_port}
@@ -230,6 +229,8 @@ For users in China who are unable to download models directly from Huggingface,
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+export no_proxy="Your_No_Proxy",chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm_service
 ```

 3. Set up other environment variables:

ChatQnA/docker_compose/intel/cpu/xeon/README_qdrant.md

Lines changed: 2 additions & 2 deletions
@@ -167,10 +167,10 @@ export host_ip="External_Public_IP"
 export your_hf_api_token="Your_Huggingface_API_Token"
 ```

-**Append the value of the public IP address to the no_proxy list**
+**Append the value of the public IP address to the no_proxy list if you are in a proxy environment**

 ```
-export your_no_proxy=${your_no_proxy},"External_Public_IP"
+export your_no_proxy=${your_no_proxy},"External_Public_IP",chatqna-xeon-ui-server,chatqna-xeon-backend-server,dataprep-qdrant-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service
 ```

 ```bash

ChatQnA/docker_compose/intel/cpu/xeon/compose.yaml

Lines changed: 1 addition & 0 deletions
@@ -112,6 +112,7 @@ services:
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=tgi-service
       - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host
     restart: always

ChatQnA/docker_compose/intel/cpu/xeon/compose_qdrant.yaml

Lines changed: 1 addition & 0 deletions
@@ -111,6 +111,7 @@ services:
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=tgi-service
       - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host
     restart: always

ChatQnA/docker_compose/intel/cpu/xeon/compose_vllm.yaml

Lines changed: 1 addition & 0 deletions
@@ -110,6 +110,7 @@ services:
       - RERANK_SERVER_PORT=${RERANK_SERVER_PORT:-80}
       - LLM_SERVER_HOST_IP=vllm_service
       - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host
     restart: always

ChatQnA/docker_compose/intel/cpu/xeon/compose_without_rerank.yaml

Lines changed: 1 addition & 0 deletions
@@ -93,6 +93,7 @@ services:
       - RETRIEVER_SERVICE_HOST_IP=retriever
       - LLM_SERVER_HOST_IP=tgi-service
       - LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
+      - LLM_MODEL=${LLM_MODEL_ID}
       - LOGFLAG=${LOGFLAG}
     ipc: host
     restart: always
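Each of the Xeon compose variants above adds the same `LLM_MODEL=${LLM_MODEL_ID}` mapping into the backend's environment. A quick, optional way to confirm the variable resolves before bringing a stack up (sketch; the model value shown is only an example):

```bash
# Sketch: check that LLM_MODEL_ID is set and propagates into the rendered compose config.
export LLM_MODEL_ID=${LLM_MODEL_ID:-"Intel/neural-chat-7b-v3-3"}   # example value
docker compose -f compose.yaml config | grep LLM_MODEL
```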

ChatQnA/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 4 additions & 4 deletions
@@ -17,8 +17,6 @@ To set up environment variables for deploying ChatQnA services, follow these ste
 ```bash
 # Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
-# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
-export no_proxy="Your_No_Proxy"
 export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 ```

@@ -27,6 +25,8 @@ To set up environment variables for deploying ChatQnA services, follow these ste
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+export no_proxy="Your_No_Proxy",chatqna-gaudi-ui-server,chatqna-gaudi-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm_service,vllm-ray-service,guardrails
 ```

 3. Set up other environment variables:
@@ -216,8 +216,6 @@ For users in China who are unable to download models directly from Huggingface,
 ```bash
 # Example: host_ip="192.168.1.1"
 export host_ip="External_Public_IP"
-# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
-export no_proxy="Your_No_Proxy"
 export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
 # Example: NGINX_PORT=80
 export NGINX_PORT=${your_nginx_port}
@@ -228,6 +226,8 @@ For users in China who are unable to download models directly from Huggingface,
 ```bash
 export http_proxy="Your_HTTP_Proxy"
 export https_proxy="Your_HTTPs_Proxy"
+# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
+export no_proxy="Your_No_Proxy",chatqna-gaudi-ui-server,chatqna-gaudi-backend-server,dataprep-redis-service,tei-embedding-service,retriever,tei-reranking-service,tgi-service,vllm_service,vllm-ray-service,guardrails
 ```

 3. Set up other environment variables:
