This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on AIPC. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`, and `llm`.

## Prerequisites

We use [Ollama](https://ollama.com/) as our LLM service for AIPC.

Please follow the instructions below to set up Ollama on your PC. This configures the endpoint that the ChatQnA examples use to reach Ollama.

### Set Up Ollama LLM Service

#### Install Ollama Service

Install the Ollama service with one command:

```
curl -fsSL https://ollama.com/install.sh | sh
```
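
Optionally, you can confirm the installation before continuing. This quick check is not part of the original flow; it simply prints the installed CLI version:

```bash
# Verify the Ollama CLI is on the PATH after the install script finishes
ollama --version
```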

#### Set Ollama Service Configuration

The Ollama service configuration file is `/etc/systemd/system/ollama.service`. Edit the file to set the `OLLAMA_HOST` environment variable.
Replace **<host_ip>** with your host IPv4 address (use the externally reachable IP). For example, if the host_ip is 10.132.x.y, set `Environment="OLLAMA_HOST=10.132.x.y:11434"`.

```
Environment="OLLAMA_HOST=<host_ip>:11434"
```
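
If you prefer not to edit the unit file in place, a systemd override drop-in achieves the same result. This is only a sketch using the example address from above; substitute your own host IP:

```bash
# Open an editor for an override drop-in of the ollama unit
sudo systemctl edit ollama.service
# In the editor, add the following lines, then save and exit:
#   [Service]
#   Environment="OLLAMA_HOST=10.132.x.y:11434"
```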

#### Set https_proxy environment for Ollama

If your system accesses the network through a proxy, add `https_proxy` to the Ollama service configuration file:

```
Environment="https_proxy=Your_HTTPS_Proxy"
```

#### Restart Ollama services

```
sudo systemctl daemon-reload
sudo systemctl restart ollama.service
```

#### Check the service started

```
netstat -tuln | grep 11434
```

The output should look like:

```
tcp        0      0 10.132.x.y:11434        0.0.0.0:*               LISTEN
```
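
If `netstat` is not available on your machine, `ss` from the iproute2 package gives an equivalent view. This alternative is optional and not part of the original instructions:

```bash
# Same check as above, using ss instead of netstat
ss -tuln | grep 11434
```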

#### Pull Ollama LLM model

Run the following commands to download the LLM model. The <host_ip> is the one set in [Ollama Service Configuration](#set-ollama-service-configuration).

```
export host_ip=<host_ip>
export OLLAMA_HOST=http://${host_ip}:11434
ollama pull llama3.2
```

After the model is downloaded, you can list the available models with `ollama list`.

The output should be similar to the following:

```
NAME                    ID              SIZE      MODIFIED
llama3.2:latest         a80c4f17acd5    2.0 GB    2 minutes ago
```

### Consume Ollama LLM Service

Access the Ollama service to verify that it is functioning correctly.

```bash
curl http://${host_ip}:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "messages": [
          {
            "role": "system",
            "content": "You are a helpful assistant."
          },
          {
            "role": "user",
            "content": "Hello!"
          }
        ]
      }'
```

The output is similar to the following:

```
{"id":"chatcmpl-4","object":"chat.completion","created":1729232496,"model":"llama3.2","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"How can I assist you today? Are you looking for information, answers to a question, or just need someone to chat with? I'm here to help in any way I can."},"finish_reason":"stop"}],"usage":{"prompt_tokens":33,"completion_tokens":38,"total_tokens":71}}
```
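
The request above uses Ollama's OpenAI-compatible route. If you would rather exercise Ollama's native chat endpoint, a roughly equivalent request looks like the sketch below; the `/api/chat` path and `stream` flag follow Ollama's documented native API, so adjust if your version differs:

```bash
# Non-streaming request against the native Ollama chat endpoint
curl http://${host_ip}:11434/api/chat \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "messages": [
          { "role": "user", "content": "Hello!" }
        ],
        "stream": false
      }'
```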

## 🚀 Build Docker Images

First of all, you need to build the Docker images locally and install the required Python package.
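
The images are built from the GenAIComps sources, so cloning that repository is the usual starting point; the exact build commands follow in the steps below, and this snippet is only a sketch of that first step:

```bash
# Fetch the GenAIComps sources referenced at the top of this document
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```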