Commit 5eb3d28

Update AgentQnA example for v1.1 release (#885)
Signed-off-by: minmin-intel <minmin.hou@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent ced68e1 commit 5eb3d28

17 files changed: +212 −104 lines

AgentQnA/README.md

Lines changed: 51 additions & 25 deletions

@@ -81,17 +81,13 @@ flowchart LR
 3. Hierarchical agent can further improve performance.
 Expert worker agents, such as retrieval agent, knowledge graph agent, SQL agent, etc., can provide high-quality output for different aspects of a complex query, and the supervisor agent can aggregate the information together to provide a comprehensive answer.
 
-### Roadmap
+## Deployment with docker
 
-- v0.9: Worker agent uses open-source websearch tool (duckduckgo), agents use OpenAI GPT-4o-mini as llm backend.
-- v1.0: Worker agent uses OPEA retrieval megaservice as tool.
-- v1.0 or later: agents use open-source llm backend.
-- v1.1 or later: add safeguards
+1. Build agent docker image
 
-## Getting started
+Note: this is optional. The docker images will be automatically pulled when running the docker compose commands. This step is only needed if pulling images failed.
 
-1. Build agent docker image </br>
-First, clone the opea GenAIComps repo
+First, clone the opea GenAIComps repo.
 
 ```
 export WORKDIR=<your-work-directory>
@@ -106,35 +102,63 @@ flowchart LR
 docker build -t opea/agent-langchain:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/agent/langchain/Dockerfile .
 ```
 
-2. Launch tool services </br>
-In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
-
-```
-docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
-```
-
-3. Set up environment for this example </br>
-First, clone this repo
+2. Set up environment for this example </br>
+First, clone this repo.
 
 ```
 cd $WORKDIR
 git clone https://github.com/opea-project/GenAIExamples.git
 ```
 
-Second, set up env vars
+Second, set up env vars.
 
 ```
 export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
-# optional: OPANAI_API_KEY
+# for using open-source llms
+export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
+export HF_CACHE_DIR=<directory-where-llms-are-downloaded> #so that no need to redownload every time
+
+# optional: OPANAI_API_KEY if you want to use OpenAI models
 export OPENAI_API_KEY=<your-openai-key>
 ```
 
-4. Launch agent services</br>
-The configurations of the supervisor agent and the worker agent are defined in the docker-compose yaml file. We currently use openAI GPT-4o-mini as LLM, and we plan to add support for llama3.1-70B-instruct (served by TGI-Gaudi) in a subsequent release.
-To use openai llm, run command below.
+3. Deploy the retrieval tool (i.e., DocIndexRetriever mega-service)
+
+First, launch the mega-service.
+
+```
+cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool
+bash launch_retrieval_tool.sh
+```
+
+Then, ingest data into the vector database. Here we provide an example. You can ingest your own data.
+
+```
+bash run_ingest_data.sh
+```
+
+4. Launch other tools. </br>
+In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
+
+```
+docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
+```
+
+5. Launch agent services</br>
+We provide two options for `llm_engine` of the agents: 1. open-source LLMs, 2. OpenAI models via API calls.
+
+To use open-source LLMs on Gaudi2, run commands below.
+
+```
+cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
+bash launch_tgi_gaudi.sh
+bash launch_agent_service_tgi_gaudi.sh
+```
+
+To use OpenAI models, run commands below.
 
 ```
-cd docker_compose/intel/cpu/xeon
+cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
 bash launch_agent_service_openai.sh
 ```
 
@@ -143,10 +167,12 @@ flowchart LR
 First look at logs of the agent docker containers:
 
 ```
-docker logs docgrader-agent-endpoint
+# worker agent
+docker logs rag-agent-endpoint
 ```
 
 ```
+# supervisor agent
 docker logs react-agent-endpoint
 ```
 
@@ -170,4 +196,4 @@ curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: app
 
 ## How to register your own tools with agent
 
-You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md#5-customize-agent-strategy).
+You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain/README.md).
Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@
+# Deployment on Xeon
+
+We deploy the retrieval tool on Xeon. For LLMs, we support OpenAI models via API calls. For instructions on using open-source LLMs, please refer to the deployment guide [here](../../../../README.md).

AgentQnA/docker_compose/intel/cpu/xeon/compose_openai.yaml

Lines changed: 4 additions & 4 deletions

@@ -2,11 +2,10 @@
 # SPDX-License-Identifier: Apache-2.0
 
 services:
-  worker-docgrader-agent:
+  worker-rag-agent:
     image: opea/agent-langchain:latest
-    container_name: docgrader-agent-endpoint
+    container_name: rag-agent-endpoint
     volumes:
-      - ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
       - ${TOOLSET_PATH}:/home/user/tools/
     ports:
       - "9095:9095"
@@ -36,8 +35,9 @@ services:
   supervisor-react-agent:
     image: opea/agent-langchain:latest
     container_name: react-agent-endpoint
+    depends_on:
+      - worker-rag-agent
     volumes:
-      - ${WORKDIR}/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
       - ${TOOLSET_PATH}:/home/user/tools/
     ports:
       - "9090:9090"

AgentQnA/docker_compose/intel/cpu/xeon/launch_agent_service_openai.sh

Lines changed: 1 addition & 1 deletion

@@ -7,7 +7,7 @@ export recursion_limit_worker=12
 export recursion_limit_supervisor=10
 export model="gpt-4o-mini-2024-07-18"
 export temperature=0
-export max_new_tokens=512
+export max_new_tokens=4096
 export OPENAI_API_KEY=${OPENAI_API_KEY}
 export WORKER_AGENT_URL="http://${ip_address}:9095/v1/chat/completions"
 export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"

AgentQnA/docker_compose/intel/hpu/gaudi/compose.yaml

Lines changed: 5 additions & 34 deletions

@@ -2,37 +2,9 @@
 # SPDX-License-Identifier: Apache-2.0
 
 services:
-  tgi-server:
-    image: ghcr.io/huggingface/tgi-gaudi:2.0.5
-    container_name: tgi-server
-    ports:
-      - "8085:80"
-    volumes:
-      - ${HF_CACHE_DIR}:/data
-    environment:
-      no_proxy: ${no_proxy}
-      http_proxy: ${http_proxy}
-      https_proxy: ${https_proxy}
-      HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
-      HF_HUB_DISABLE_PROGRESS_BARS: 1
-      HF_HUB_ENABLE_HF_TRANSFER: 0
-      HABANA_VISIBLE_DEVICES: all
-      OMPI_MCA_btl_vader_single_copy_mechanism: none
-      PT_HPU_ENABLE_LAZY_COLLECTIVES: true
-      ENABLE_HPU_GRAPH: true
-      LIMIT_HPU_GRAPH: true
-      USE_FLASH_ATTENTION: true
-      FLASH_ATTENTION_RECOMPUTE: true
-    runtime: habana
-    cap_add:
-      - SYS_NICE
-    ipc: host
-    command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --sharded true --num-shard ${NUM_SHARDS}
-  worker-docgrader-agent:
+  worker-rag-agent:
     image: opea/agent-langchain:latest
-    container_name: docgrader-agent-endpoint
-    depends_on:
-      - tgi-server
+    container_name: rag-agent-endpoint
     volumes:
       # - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
       - ${TOOLSET_PATH}:/home/user/tools/
@@ -41,7 +13,7 @@ services:
     ipc: host
     environment:
       ip_address: ${ip_address}
-      strategy: rag_agent
+      strategy: rag_agent_llama
       recursion_limit: ${recursion_limit_worker}
       llm_engine: tgi
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
@@ -66,8 +38,7 @@ services:
     image: opea/agent-langchain:latest
     container_name: react-agent-endpoint
     depends_on:
-      - tgi-server
-      - worker-docgrader-agent
+      - worker-rag-agent
     volumes:
       # - ${WORKDIR}/GenAIExamples/AgentQnA/docker_image_build/GenAIComps/comps/agent/langchain/:/home/user/comps/agent/langchain/
       - ${TOOLSET_PATH}:/home/user/tools/
@@ -76,7 +47,7 @@ services:
     ipc: host
     environment:
       ip_address: ${ip_address}
-      strategy: react_langgraph
+      strategy: react_llama
       recursion_limit: ${recursion_limit_supervisor}
       llm_engine: tgi
       HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}

AgentQnA/docker_compose/intel/hpu/gaudi/launch_agent_service_tgi_gaudi.sh

Lines changed: 1 addition & 15 deletions

@@ -15,7 +15,7 @@ export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-70B-Instruct"
 export NUM_SHARDS=4
 export LLM_ENDPOINT_URL="http://${ip_address}:8085"
 export temperature=0.01
-export max_new_tokens=512
+export max_new_tokens=4096
 
 # agent related environment variables
 export TOOLSET_PATH=$WORKDIR/GenAIExamples/AgentQnA/tools/
@@ -27,17 +27,3 @@ export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
 export CRAG_SERVER=http://${ip_address}:8080
 
 docker compose -f compose.yaml up -d
-
-sleep 5s
-echo "Waiting tgi gaudi ready"
-n=0
-until [[ "$n" -ge 100 ]] || [[ $ready == true ]]; do
-    docker logs tgi-server &> tgi-gaudi-service.log
-    n=$((n+1))
-    if grep -q Connected tgi-gaudi-service.log; then
-        break
-    fi
-    sleep 5s
-done
-sleep 5s
-echo "Service started successfully"
Lines changed: 25 additions & 0 deletions

@@ -0,0 +1,25 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# LLM related environment variables
+export HF_CACHE_DIR=${HF_CACHE_DIR}
+ls $HF_CACHE_DIR
+export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
+export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-70B-Instruct"
+export NUM_SHARDS=4
+
+docker compose -f tgi_gaudi.yaml up -d
+
+sleep 5s
+echo "Waiting tgi gaudi ready"
+n=0
+until [[ "$n" -ge 100 ]] || [[ $ready == true ]]; do
+    docker logs tgi-server &> tgi-gaudi-service.log
+    n=$((n+1))
+    if grep -q Connected tgi-gaudi-service.log; then
+        break
+    fi
+    sleep 5s
+done
+sleep 5s
+echo "Service started successfully"
Lines changed: 30 additions & 0 deletions

@@ -0,0 +1,30 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+services:
+  tgi-server:
+    image: ghcr.io/huggingface/tgi-gaudi:2.0.5
+    container_name: tgi-server
+    ports:
+      - "8085:80"
+    volumes:
+      - ${HF_CACHE_DIR}:/data
+    environment:
+      no_proxy: ${no_proxy}
+      http_proxy: ${http_proxy}
+      https_proxy: ${https_proxy}
+      HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+      HF_HUB_DISABLE_PROGRESS_BARS: 1
+      HF_HUB_ENABLE_HF_TRANSFER: 0
+      HABANA_VISIBLE_DEVICES: all
+      OMPI_MCA_btl_vader_single_copy_mechanism: none
+      PT_HPU_ENABLE_LAZY_COLLECTIVES: true
+      ENABLE_HPU_GRAPH: true
+      LIMIT_HPU_GRAPH: true
+      USE_FLASH_ATTENTION: true
+      FLASH_ATTENTION_RECOMPUTE: true
+    runtime: habana
+    cap_add:
+      - SYS_NICE
+    ipc: host
+    command: --model-id ${LLM_MODEL_ID} --max-input-length 4096 --max-total-tokens 8192 --sharded true --num-shard ${NUM_SHARDS}

AgentQnA/tests/4_launch_and_validate_agent_tgi.sh renamed to AgentQnA/tests/step4_launch_and_validate_agent_tgi.sh

Lines changed: 23 additions & 8 deletions

@@ -17,6 +17,12 @@ if [ ! -d "$HF_CACHE_DIR" ]; then
 fi
 ls $HF_CACHE_DIR
 
+function start_tgi(){
+    echo "Starting tgi-gaudi server"
+    cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
+    bash launch_tgi_gaudi.sh
+
+}
 
 function start_agent_and_api_server() {
     echo "Starting CRAG server"
@@ -25,6 +31,7 @@ function start_agent_and_api_server() {
     echo "Starting Agent services"
     cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi
     bash launch_agent_service_tgi_gaudi.sh
+    sleep 10
 }
 
 function validate() {
@@ -43,18 +50,22 @@ function validate() {
 
 function validate_agent_service() {
     echo "----------------Test agent ----------------"
-    local CONTENT=$(http_proxy="" curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
-    "query": "Tell me about Michael Jackson song thriller"
-    }')
-    local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
-    docker logs docgrader-agent-endpoint
+    # local CONTENT=$(http_proxy="" curl http://${ip_address}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
+    # "query": "Tell me about Michael Jackson song thriller"
+    # }')
+    export agent_port="9095"
+    local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py)
+    local EXIT_CODE=$(validate "$CONTENT" "Thriller" "rag-agent-endpoint")
+    docker logs rag-agent-endpoint
     if [ "$EXIT_CODE" == "1" ]; then
         exit 1
     fi
 
-    local CONTENT=$(http_proxy="" curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
-    "query": "Tell me about Michael Jackson song thriller"
-    }')
+    # local CONTENT=$(http_proxy="" curl http://${ip_address}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
+    # "query": "Tell me about Michael Jackson song thriller"
+    # }')
+    export agent_port="9090"
+    local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py)
     local EXIT_CODE=$(validate "$CONTENT" "Thriller" "react-agent-endpoint")
     docker logs react-agent-endpoint
     if [ "$EXIT_CODE" == "1" ]; then
@@ -64,6 +75,10 @@ function validate_agent_service() {
 }
 
 function main() {
+    echo "==================== Start TGI ===================="
+    start_tgi
+    echo "==================== TGI started ===================="
+
     echo "==================== Start agent ===================="
     start_agent_and_api_server
     echo "==================== Agent started ===================="

AgentQnA/tests/test.py

Lines changed: 25 additions & 0 deletions

@@ -0,0 +1,25 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+import os
+
+import requests
+
+
+def generate_answer_agent_api(url, prompt):
+    proxies = {"http": ""}
+    payload = {
+        "query": prompt,
+    }
+    response = requests.post(url, json=payload, proxies=proxies)
+    answer = response.json()["text"]
+    return answer
+
+
+if __name__ == "__main__":
+    ip_address = os.getenv("ip_address", "localhost")
+    agent_port = os.getenv("agent_port", "9095")
+    url = f"http://{ip_address}:{agent_port}/v1/chat/completions"
+    prompt = "Tell me about Michael Jackson song thriller"
+    answer = generate_answer_agent_api(url, prompt)
+    print(answer)
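The new test.py needs a live agent endpoint, but the round trip it performs (POST a `{"query": ...}` JSON body, read back a `{"text": ...}` reply) can be exercised against a throwaway in-process server. This is an illustrative sketch only: `StubAgentHandler` and `generate_answer` are hypothetical names, the stub merely echoes the query instead of running an agent, and stdlib `urllib` stands in for `requests`:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

class StubAgentHandler(BaseHTTPRequestHandler):
    """Answers POSTs with the same {"text": ...} schema the agent returns."""
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        query = json.loads(body)["query"]
        reply = json.dumps({"text": "echo: " + query}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # silence per-request logging
        pass

def generate_answer(url, prompt):
    # stdlib twin of generate_answer_agent_api in test.py
    req = Request(url, data=json.dumps({"query": prompt}).encode(),
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.loads(resp.read())["text"]

server = HTTPServer(("localhost", 0), StubAgentHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://localhost:%d/v1/chat/completions" % server.server_port
answer = generate_answer(url, "Tell me about Michael Jackson song thriller")
server.shutdown()
```

Swapping the stub URL for `http://${ip_address}:9095/...` or `:9090/...` would exercise the real worker or supervisor endpoints instead.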
