Commit 8df3088

Merge branch 'main' into mmqna-phase3

2 parents 307675f + 2764a6d
File tree: 10 files changed, +299 −37 lines

AgentQnA/README.md

Lines changed: 19 additions & 3 deletions

````diff
@@ -1,5 +1,13 @@
 # Agents for Question Answering

+## Table of contents
+
+1. [Overview](#overview)
+2. [Deploy with Docker](#deploy-with-docker)
+3. [Launch the UI](#launch-the-ui)
+4. [Validate Services](#validate-services)
+5. [Register Tools](#how-to-register-other-tools-with-the-ai-agent)
+
 ## Overview

 This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram below shows a supervisor agent that interfaces with the user and dispatches tasks to two worker agents to gather information and come up with answers. The worker RAG agent uses the retrieval tool to retrieve relevant documents from a knowledge base - a vector database. The worker SQL agent retrieves relevant data from a SQL database. Although not included in this example by default, other tools such as a web search tool or a knowledge graph query tool can be used by the supervisor agent to gather information from additional sources.
@@ -134,7 +142,7 @@ source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/set_env.sh
 source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh
 ```

-### 3. Launch the multi-agent system. </br>
+### 2. Launch the multi-agent system. </br>

 Two options are provided for the `llm_engine` of the agents: 1. open-source LLMs on Gaudi, 2. OpenAI models via API calls.
@@ -151,11 +159,19 @@ cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
 docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose.yaml up -d
 ```

+To enable OpenTelemetry tracing, the compose.telemetry.yaml file needs to be merged with the default compose.yaml file.
+Gaudi example with the OpenTelemetry feature enabled:
+
+```bash
+cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
+docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose.yaml -f compose.telemetry.yaml up -d
+```
+
 ##### [Optional] Web Search Tool Support

 <details>
 <summary> Instructions </summary>
 A web search tool is supported in this example and can be enabled by running docker compose with the `compose.webtool.yaml` file.
 The Google Search API is used. Follow the [instructions](https://python.langchain.com/docs/integrations/tools/google_search) to create an API key and enable the Custom Search API on a Google account. The environment variables `GOOGLE_CSE_ID` and `GOOGLE_API_KEY` need to be set.

 ```bash
@@ -179,7 +195,7 @@ cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
 docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
 ```

-### 4. Ingest Data into the vector database
+### 3. Ingest Data into the vector database

 The `run_ingest_data.sh` script will use an example jsonl file to ingest example documents into a vector database. Other ways to ingest data and other types of documents supported can be found in the OPEA dataprep microservice located in the opea-project/GenAIComps repo.
````
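If the telemetry overlay is used, a quick way to confirm the observability stack came up is to probe the UI and health ports published by compose.telemetry.yaml (added later in this commit). A minimal sketch, assuming default host ports and local access:

```bash
# Hedged smoke test after `docker compose ... -f compose.telemetry.yaml up -d`.
# Ports are taken from the compose.telemetry.yaml in this commit; adjust the host if remote.
curl -sf -o /dev/null http://localhost:16686 && echo "Jaeger UI up"
curl -sf -o /dev/null http://localhost:9091/-/ready && echo "Prometheus ready"   # host 9091 -> container 9090
curl -sf -o /dev/null http://localhost:3000/api/health && echo "Grafana healthy"
```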
AgentQnA/docker_compose/amd/gpu/rocm/README.md

Lines changed: 38 additions & 30 deletions

````diff
@@ -211,11 +211,18 @@ All containers should be running and should not restart:

 ##### If you use TGI:

-- agentqna-tgi-service
-- whisper-service
-- speecht5-service
-- agentqna-backend-server
-- agentqna-ui-server
+- dataprep-redis-server
+- doc-index-retriever-server
+- embedding-server
+- rag-agent-endpoint
+- react-agent-endpoint
+- redis-vector-db
+- reranking-tei-xeon-server
+- retriever-redis-server
+- sql-agent-endpoint
+- tei-embedding-server
+- tei-reranking-server
+- tgi-service

 ---

@@ -229,7 +236,7 @@ All containers should be running and should not restart:
 DATA='{"model": "Intel/neural-chat-7b-v3-3t", '\
 '"messages": [{"role": "user", "content": "What is Deep Learning?"}], "max_tokens": 256}'

-curl http://${HOST_IP}:${AUDIOQNA_VLLM_SERVICE_PORT}/v1/chat/completions \
+curl http://${HOST_IP}:${VLLM_SERVICE_PORT}/v1/chat/completions \
   -X POST \
   -d "$DATA" \
   -H 'Content-Type: application/json'
@@ -270,7 +277,7 @@ then we consider the vLLM service to be successfully launched
 DATA='{"inputs":"What is Deep Learning?",'\
 '"parameters":{"max_new_tokens":256,"do_sample": true}}'

-curl http://${HOST_IP}:${AUDIOQNA_TGI_SERVICE_PORT}/generate \
+curl http://${HOST_IP}:${TGI_SERVICE_PORT}/generate \
   -X POST \
   -d "$DATA" \
   -H 'Content-Type: application/json'
@@ -287,48 +294,49 @@ Checking the response from the service. The response should be similar to JSON:
 If the service response has a meaningful response in the value of the "generated_text" key,
 then we consider the TGI service to be successfully launched

-### 2. Validate MegaServices
+### 2. Validate Agent Services

-Test the AgentQnA megaservice by recording a .wav file, encoding the file into the base64 format, and then sending the
-base64 string to the megaservice endpoint. The megaservice will return a spoken response as a base64 string. To listen
-to the response, decode the base64 string and save it as a .wav file.
+#### Validate RAG Agent Service

 ```bash
-# voice can be "default" or "male"
-curl http://${host_ip}:3008/v1/agentqna \
-  -X POST \
-  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA", "max_tokens":64, "voice":"default"}' \
-  -H 'Content-Type: application/json' | sed 's/^"//;s/"$//' | base64 -d > output.wav
+export agent_port=${WORKER_RAG_AGENT_PORT}
+prompt="Tell me about Michael Jackson song Thriller"
+python3 ~/agentqna-install/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port
 ```

-### 3. Validate MicroServices
+The response should contain meaningful text answering the request in the "prompt" variable.
+
+#### Validate SQL Agent Service

 ```bash
-# whisper service
-curl http://${host_ip}:7066/v1/asr \
-  -X POST \
-  -d '{"audio": "UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA"}' \
-  -H 'Content-Type: application/json'
+export agent_port=${WORKER_SQL_AGENT_PORT}
+prompt="How many employees are there in the company?"
+python3 ~/agentqna-install/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port
+```

-# speecht5 service
-curl http://${host_ip}:7055/v1/tts \
-  -X POST \
-  -d '{"text": "Who are you?"}' \
-  -H 'Content-Type: application/json'
+The answer should make sense, e.g. "8 employees in the company".
+
+#### Validate React (Supervisor) Agent Service
+
+```bash
+export agent_port=${SUPERVISOR_REACT_AGENT_PORT}
+python3 ~/agentqna-install/GenAIExamples/AgentQnA/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream
 ```

-### 4. Stop application
+The response should contain "Iron Maiden".
+
+### 3. Stop application

 #### If you use vLLM

 ```bash
 cd ~/agentqna-install/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
-docker compose -f compose_vllm.yaml down
+bash stop_agent_service_vllm_rocm.sh
 ```

 #### If you use TGI

 ```bash
 cd ~/agentqna-install/GenAIExamples/AgentQnA/docker_compose/amd/gpu/rocm
-docker compose -f compose.yaml down
+bash stop_agent_service_tgi_rocm.sh
 ```
````
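The test.py helper wraps a plain HTTP call, so if the script is unavailable the same check can be approximated with curl. A hedged sketch, assuming the agents expose an OpenAI-style chat-completions route as shown in the main AgentQnA README (not confirmed for this ROCm guide):

```bash
# Hedged alternative to tests/test.py: query the RAG worker agent directly.
# The /v1/chat/completions route and payload shape are assumed from the
# main AgentQnA README.
curl http://${HOST_IP}:${WORKER_RAG_AGENT_PORT}/v1/chat/completions \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"messages": "Tell me about Michael Jackson song Thriller"}'
```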
Lines changed: 93 additions & 0 deletions (new file)

```yaml
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  tei-embedding-service:
    command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
  tei-reranking-service:
    command: --model-id ${RERANK_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
  jaeger:
    image: jaegertracing/all-in-one:1.67.0
    container_name: jaeger
    ports:
      - "16686:16686"
      - "4317:4317"
      - "4318:4318"
      - "9411:9411"
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      COLLECTOR_ZIPKIN_HOST_PORT: 9411
    restart: unless-stopped
  prometheus:
    image: prom/prometheus:v2.52.0
    container_name: prometheus
    user: root
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yaml
      - ./prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yaml'
    ports:
      - '9091:9090'
    ipc: host
    restart: unless-stopped
  grafana:
    image: grafana/grafana:11.0.0
    container_name: grafana
    volumes:
      - ./grafana_data:/var/lib/grafana
      - ./grafana/dashboards:/var/lib/grafana/dashboards
      - ./grafana/provisioning:/etc/grafana/provisioning
    user: root
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
      GF_RENDERING_CALLBACK_URL: http://grafana:3000/
      GF_LOG_FILTERS: rendering:debug
    depends_on:
      - prometheus
    ports:
      - '3000:3000'
    ipc: host
    restart: unless-stopped
  node-exporter:
    image: prom/node-exporter
    container_name: node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - --collector.filesystem.ignored-mount-points
      - "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)"
    ports:
      - 9100:9100
    restart: always
    deploy:
      mode: global
  gaudi-exporter:
    image: vault.habana.ai/gaudi-metric-exporter/metric-exporter:1.19.2-32
    container_name: gaudi-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
      - /dev:/dev
    ports:
      - 41612:41611
    restart: always
    deploy:
      mode: global
  worker-rag-agent:
    environment:
      - TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
  worker-sql-agent:
    environment:
      - TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
  supervisor-react-agent:
    environment:
      - TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
```
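The overlay consumes two environment variables that must be set before `docker compose up`. A minimal sketch of plausible values, assuming Jaeger's OTLP ports as mapped above; the authoritative values come from set_env.sh:

```bash
# Assumed values only: both variables are referenced in compose.telemetry.yaml above.
# 4317 is Jaeger's OTLP/gRPC port (used by TEI's --otlp-endpoint); 4318 is OTLP/HTTP.
export host_ip=$(hostname -I | awk '{print $1}')
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://${host_ip}:4317
export TELEMETRY_ENDPOINT=http://${host_ip}:4318/v1/traces
```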
Lines changed: 10 additions & 0 deletions (new file)

```bash
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

rm *.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/chatqna_megaservice_grafana.json
mv chatqna_megaservice_grafana.json agentqna_microervices_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/vllm_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/tgi_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/node_grafana.json
wget https://raw.githubusercontent.com/opea-project/GenAIEval/refs/heads/main/evals/benchmark/grafana/gaudi_grafana.json
```
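Grafana will not load dashboards that fail to parse, so it can be worth validating the downloaded files before the container mounts them. A small sketch using only the Python standard library:

```bash
# Sanity-check that each downloaded dashboard is valid JSON before Grafana loads it.
for f in *.json; do
  python3 -m json.tool "$f" > /dev/null && echo "OK: $f" || echo "BROKEN: $f"
done
```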
Lines changed: 14 additions & 0 deletions (new file)

```yaml
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    updateIntervalSeconds: 10 # how often Grafana will scan for changed dashboards
    options:
      path: /var/lib/grafana/dashboards
```
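Once Grafana is up, the provider above should register every JSON file found under /var/lib/grafana/dashboards (the directory mounted in the compose overlay). A hedged check via Grafana's HTTP API, assuming the default admin password set in that overlay:

```bash
# List provisioned dashboards; credentials come from GF_SECURITY_ADMIN_PASSWORD above.
curl -s -u admin:admin "http://localhost:3000/api/search?type=dash-db"
```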
Lines changed: 54 additions & 0 deletions (new file)

```yaml
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# config file version
apiVersion: 1

# list of datasources that should be deleted from the database
deleteDatasources:
  - name: Prometheus
    orgId: 1

# list of datasources to insert/update depending on
# what's available in the database
datasources:
  # <string, required> name of the datasource. Required
  - name: Prometheus
    # <string, required> datasource type. Required
    type: prometheus
    # <string, required> access mode. direct or proxy. Required
    access: proxy
    # <int> org id. will default to orgId 1 if not specified
    orgId: 1
    # <string> url
    url: http://prometheus:9090
    # <string> database password, if used
    password:
    # <string> database user, if used
    user:
    # <string> database name, if used
    database:
    # <bool> enable/disable basic auth
    basicAuth: false
    # <string> basic auth username, if used
    basicAuthUser:
    # <string> basic auth password, if used
    basicAuthPassword:
    # <bool> enable/disable with credentials headers
    withCredentials:
    # <bool> mark as default datasource. Max one per org
    isDefault: true
    # <map> fields that will be converted to json and stored in json_data
    jsonData:
      httpMethod: GET
      graphiteVersion: "1.1"
      tlsAuth: false
      tlsAuthWithCACert: false
    # <string> json object of data that will be encrypted.
    secureJsonData:
      tlsCACert: "..."
      tlsClientCert: "..."
      tlsClientKey: "..."
    version: 1
    # <bool> allow users to edit datasources from the UI.
    editable: true
```
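A similar hedged check confirms the Prometheus datasource was provisioned and marked default:

```bash
# Inspect provisioned datasources; expect one "Prometheus" entry with "isDefault": true.
curl -s -u admin:admin http://localhost:3000/api/datasources
```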
Lines changed: 55 additions & 0 deletions (new file)

```yaml
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

global:
  scrape_interval: 5s
  external_labels:
    monitor: "my-monitor"
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["prometheus:9090"]
  - job_name: "vllm"
    metrics_path: /metrics
    static_configs:
      - targets: ["vllm-gaudi-server:8000"]
  - job_name: "tgi"
    metrics_path: /metrics
    static_configs:
      - targets: ["tgi-gaudi-server:80"]
  - job_name: "tei-embedding"
    metrics_path: /metrics
    static_configs:
      - targets: ["tei-embedding-server:80"]
  - job_name: "tei-reranking"
    metrics_path: /metrics
    static_configs:
      - targets: ["tei-reranking-server:80"]
  - job_name: "retriever"
    metrics_path: /metrics
    static_configs:
      - targets: ["retriever:7000"]
  - job_name: "dataprep-redis-service"
    metrics_path: /metrics
    static_configs:
      - targets: ["dataprep-redis-service:5000"]
  - job_name: "prometheus-node-exporter"
    metrics_path: /metrics
    static_configs:
      - targets: ["node-exporter:9100"]
  - job_name: "prometheus-gaudi-exporter"
    metrics_path: /metrics
    static_configs:
      - targets: ["gaudi-exporter:41611"]
  - job_name: "supervisor-react-agent"
    metrics_path: /metrics
    static_configs:
      - targets: ["react-agent-endpoint:9090"]
  - job_name: "worker-rag-agent"
    metrics_path: /metrics
    static_configs:
      - targets: ["rag-agent-endpoint:9095"]
  - job_name: "worker-sql-agent"
    metrics_path: /metrics
    static_configs:
      - targets: ["sql-agent-endpoint:9096"]
```
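With all scrape jobs defined, Prometheus's targets API shows whether each endpoint is actually reachable. A minimal sketch (host port 9091 maps to container port 9090 in the telemetry overlay):

```bash
# Print each scrape job and its health; every job above should report "up".
curl -s http://localhost:9091/api/v1/targets | python3 -c '
import json, sys
for t in json.load(sys.stdin)["data"]["activeTargets"]:
    print(t["labels"]["job"], t["health"])
'
```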
