# Add HybridRAG as a new application in the GenAIExamples #1968
**Open:** jeanyu-habana wants to merge 38 commits into `opea-project:main` from `jeanyu-habana:dev`.
## Commits (38)
- 4514666 enable hybridrag pipeline (jeanyu-habana)
- 46e51a1 change text2cypher port name (siddhivelankar23)
- fb3bfaa add test script (jeanyu-habana)
- 8a63852 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana)
- d5a9ad2 add ui (siddhivelankar23)
- 15fbd1b add README and update build/test scripts (jeanyu-habana)
- 02952a9 update deployment README (jeanyu-habana)
- da9658c update to enable existing knowledge graph (jeanyu-habana)
- 26795e9 Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana)
- 03fd3b9 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 91ef6d5 resolve precommit-ci error (jeanyu-habana)
- 4fbd539 resolve merge conflicts (jeanyu-habana)
- 611a5b2 fixed var name (jeanyu-habana)
- f927076 skip txt files during codespell (jeanyu-habana)
- fcb88fb skip text files in codespell (jeanyu-habana)
- ec6c0a8 clarify the purpose of data files (jeanyu-habana)
- 752e970 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 803a17a fix build env (jeanyu-habana)
- 31e993f Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana)
- 92393c4 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana)
- 970121b update build/test (jeanyu-habana)
- 12020d3 debug (jeanyu-habana)
- e041b2a fixed port (jeanyu-habana)
- d07c2da updated env (jeanyu-habana)
- 2a4a96c debug (jeanyu-habana)
- fe751df debug (jeanyu-habana)
- 30d51da debug (jeanyu-habana)
- 6ffeef2 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- bf173ae debug (jeanyu-habana)
- 65b5328 Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana)
- a3715a8 debug (jeanyu-habana)
- 11293a5 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 6c47bb6 debug (jeanyu-habana)
- ef52eb7 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana)
- d20696a debug (jeanyu-habana)
- bcb5960 debug (jeanyu-habana)
- 9f4be64 debug (jeanyu-habana)
- ed5b2a1 debug (jeanyu-habana)
## Files changed
**New file: `HybridRAG/Dockerfile`**

```dockerfile
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

ARG BASE_TAG=latest
FROM opea/comps-base:$BASE_TAG

COPY ./hybridrag.py $HOME/hybridrag.py

ENTRYPOINT ["python", "hybridrag.py"]
```
**New file: `HybridRAG/README.md`**
# HybridRAG Application

Enterprise AI systems require solutions that handle both structured data (databases, transactions, CSVs, JSON) and unstructured data (documents, images, audio). While traditional VectorRAG excels at semantic search across documents, it struggles with complex queries that require global context or relationship-aware reasoning. The HybridRAG application addresses these gaps by combining GraphRAG (knowledge graph-based retrieval) with VectorRAG (vector database retrieval) for enhanced accuracy and contextual relevance.

## Table of contents

1. [Architecture](#architecture)
2. [Deployment](#deployment)

## Architecture

The HybridRAG application is a customizable end-to-end workflow that leverages the capabilities of LLMs and RAG efficiently. The HybridRAG architecture is shown below:

![HybridRAG Architecture](assets/img/hybridrag_retrieval.png)

This application is modular: each component is a microservice (as defined in [GenAIComps](https://github.com/opea-project/GenAIComps)) that can scale independently. It comprises data preparation, embedding, retrieval, reranker (optional), and LLM microservices. All of these microservices are stitched together by the HybridRAG megaservice, which orchestrates the data flow through them. The flow chart below shows the information flow between the microservices for this example.
```mermaid
---
config:
  flowchart:
    nodeSpacing: 400
    rankSpacing: 100
    curve: linear
  themeVariables:
    fontSize: 50px
---
flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style HybridRAG-MegaService stroke:#000000

    %% Subgraphs %%
    subgraph HybridRAG-MegaService["HybridRAG MegaService "]
        direction LR
        EM([Embedding MicroService]):::blue
        RET([Retrieval MicroService]):::blue
        RER([Rerank MicroService]):::blue
        LLM([LLM MicroService]):::blue
        direction LR
        T2C([Text2Cypher MicroService]):::blue
        LLM([LLM MicroService]):::blue
    end
    subgraph UserInterface[" User Interface "]
        direction LR
        a([User Input Query]):::orchid
        UI([UI server<br>]):::orchid
    end

    TEI_RER{{Reranking service<br>}}
    TEI_EM{{Embedding service <br>}}
    VDB{{Vector DB<br><br>}}
    GDB{{Graph DB<br><br>}}
    R_RET{{Retriever service <br>}}
    DP([Data Preparation MicroService]):::blue
    S2G([Struct2Graph MicroService]):::blue
    LLM_gen{{LLM Service <br>}}
    GW([HybridRAG GateWay<br>]):::orange

    %% Questions interaction
    direction LR
    a[User Input Query] --> UI
    UI --> GW
    GW <==> HybridRAG-MegaService
    EM ==> RET
    RET ==> RER
    RER ==> LLM
    direction LR
    T2C ==> LLM

    %% Embedding service flow
    direction LR
    EM <-.-> TEI_EM
    RET <-.-> R_RET
    RER <-.-> TEI_RER
    LLM <-.-> LLM_gen

    direction TB
    %% Vector DB interaction
    R_RET <-.->|d|VDB
    DP <-.->|d|VDB

    direction TB
    %% Graph DB interaction
    T2C <-.->|d|GDB
    S2G <-.->|d|GDB
```
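To make the orchestration concrete, the sketch below shows, in simplified Python, how the two retrieval branches (VectorRAG and GraphRAG) can be combined before generation. This is an illustrative sketch only, not the shipped `hybridrag.py` megaservice: the endpoint paths, payload shapes, and model name are assumptions, the ports are taken from the default Gaudi Docker Compose deployment, and the reranking step is omitted for brevity.

```python
# Illustrative sketch of the HybridRAG flow (not the shipped hybridrag.py).
# Endpoint paths, payload shapes, and the model name below are assumptions.
import requests

HOST = "localhost"


def vector_branch(query: str) -> str:
    """VectorRAG branch: embed the query, then fetch similar chunks from the vector DB."""
    embedding = requests.post(f"http://{HOST}:6006/embed",
                              json={"inputs": query}, timeout=60).json()[0]
    docs = requests.post(f"http://{HOST}:7000/v1/retrieval",
                         json={"text": query, "embedding": embedding}, timeout=60).json()
    return "\n".join(d.get("text", "") for d in docs.get("retrieved_docs", []))


def graph_branch(query: str) -> str:
    """GraphRAG branch: translate the question to Cypher and query the knowledge graph."""
    resp = requests.post(f"http://{HOST}:11801/v1/text2cypher",
                         json={"input_text": query}, timeout=120)
    return resp.text


def hybrid_answer(query: str) -> str:
    """Merge both contexts and let the LLM generate the final answer."""
    context = vector_branch(query) + "\n" + graph_branch(query)
    completion = requests.post(
        f"http://{HOST}:9009/v1/chat/completions",
        json={
            "model": "meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model name
            "messages": [
                {"role": "system", "content": f"Answer using this context:\n{context}"},
                {"role": "user", "content": query},
            ],
        },
        timeout=300,
    ).json()
    return completion["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(hybrid_answer("what are the symptoms for Diabetes?"))
```

In the deployed application, this orchestration is handled by the HybridRAG megaservice behind the single `/v1/hybridrag` gateway endpoint.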
## Deployment

[HybridRAG deployment on Intel Gaudi](./docker_compose/intel/hpu/gaudi/README.md)
**New file: FFmpeg notice**

# Notice for FFmpeg:

FFmpeg is an open source project licensed under LGPL and GPL. See https://www.ffmpeg.org/legal.html. You are solely responsible for determining if your use of FFmpeg requires any additional licenses. Intel is not responsible for obtaining any such licenses, nor liable for any licensing fees due, in connection with your use of FFmpeg.
*(Binary file not shown.)*

**New file: `HybridRAG/docker_compose/intel/hpu/gaudi/README.md`**
# Example HybridRAG deployments on an Intel® Gaudi® Platform

This example covers a single-node, on-premises deployment of the HybridRAG example using OPEA components. There are various ways to enable HybridRAG; this example focuses on deploying the HybridRAG pipeline to Intel® Gaudi® AI Accelerators with Docker Compose.

**Note** This example requires access to a properly installed Intel® Gaudi® platform with a functional Docker service configured to use the habanalabs-container-runtime. Please consult the [Intel® Gaudi® software Installation Guide](https://docs.habana.ai/en/v1.20.0/Installation_Guide/Driver_Installation.html) for more information.

## HybridRAG Quick Start Deployment

This section describes how to quickly deploy and test the HybridRAG service manually on an Intel® Gaudi® platform. The basic steps are covered in the subsections below.

### Access the Code

Clone the GenAIExamples repository and access the HybridRAG Intel® Gaudi® platform Docker Compose files and supporting scripts:

```
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/HybridRAG/docker_compose/intel/hpu/gaudi/
```

Check out a released version, such as v1.4:

```
git checkout v1.4
```

### Generate a HuggingFace Access Token

Some HuggingFace resources, such as some models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
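Once you have a token, you can optionally confirm it is valid before deploying. The snippet below is a quick sanity check using the `huggingface_hub` Python package; it is not part of the deployment itself, and the token value is a placeholder.

```python
# Optional token sanity check; requires `pip install huggingface_hub`.
from huggingface_hub import whoami

token = "hf_..."  # paste your HuggingFace access token here
print(whoami(token=token)["name"])  # prints your account name if the token is valid
```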
### Configure the Deployment Environment

To set up environment variables for deploying the HybridRAG services, source the _set_env.sh_ script in this directory:

```
source ./set_env.sh
```
### Deploy the Services Using Docker Compose

To deploy the HybridRAG services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
docker compose up -d
```

The HybridRAG docker images should automatically be downloaded from the `OPEA registry` and deployed on the Intel® Gaudi® Platform:

```
[+] Running 9/9
 ✔ Container redis-vector-db                Healthy   6.4s
 ✔ Container vllm-service                   Started   0.4s
 ✔ Container tei-embedding-server           Started   0.9s
 ✔ Container neo4j-apoc                     Healthy  11.4s
 ✔ Container tei-reranking-server           Started   0.8s
 ✔ Container retriever-redis-server         Started   1.0s
 ✔ Container dataprep-redis-server          Started   6.5s
 ✔ Container text2cypher-gaudi-container    Started  12.2s
 ✔ Container hybridrag-xeon-backend-server  Started  12.4s
```
To rebuild the docker image for the hybridrag-xeon-backend-server container:

```
cd GenAIExamples/HybridRAG
docker build --no-cache -t opea/hybridrag:latest -f Dockerfile .
```
### Check the Deployment Status

After running docker compose, check if all the containers launched via docker compose have started:

```
docker ps -a
```

For the default deployment, the following 9 containers should have started:

```
CONTAINER ID   IMAGE                                                   COMMAND                  CREATED        STATUS                  PORTS                                                                                            NAMES
a9286abd0015   opea/hybridrag:latest                                   "python hybridrag.py"    15 hours ago   Up 15 hours             0.0.0.0:8888->8888/tcp, :::8888->8888/tcp                                                        hybridrag-xeon-backend-server
8477b154dc72   opea/text2cypher-gaudi:latest                           "/bin/sh -c 'bash ru…"   15 hours ago   Up 15 hours             0.0.0.0:11801->9097/tcp, [::]:11801->9097/tcp                                                    text2cypher-gaudi-container
688e01a431fa   opea/dataprep:latest                                    "sh -c 'python $( [ …"   15 hours ago   Up 15 hours             0.0.0.0:6007->5000/tcp, [::]:6007->5000/tcp                                                      dataprep-redis-server
54f574fe54bb   opea/retriever:latest                                   "python opea_retriev…"   15 hours ago   Up 15 hours             0.0.0.0:7000->7000/tcp, :::7000->7000/tcp                                                        retriever-redis-server
5028eb66617c   ghcr.io/huggingface/text-embeddings-inference:cpu-1.6   "text-embeddings-rou…"   15 hours ago   Up 15 hours             0.0.0.0:8808->80/tcp, [::]:8808->80/tcp                                                          tei-reranking-server
a9dbf8a13365   opea/vllm:latest                                        "python3 -m vllm.ent…"   15 hours ago   Up 15 hours (healthy)   0.0.0.0:9009->80/tcp, [::]:9009->80/tcp                                                          vllm-service
43f44830f47b   neo4j:latest                                            "tini -g -- /startup…"   15 hours ago   Up 15 hours (healthy)   0.0.0.0:7474->7474/tcp, :::7474->7474/tcp, 7473/tcp, 0.0.0.0:7687->7687/tcp, :::7687->7687/tcp   neo4j-apoc
867feabb6f11   redis/redis-stack:7.2.0-v9                              "/entrypoint.sh"         15 hours ago   Up 15 hours (healthy)   0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp             redis-vector-db
23cd7f16453b   ghcr.io/huggingface/text-embeddings-inference:cpu-1.6   "text-embeddings-rou…"   15 hours ago   Up 15 hours             0.0.0.0:6006->80/tcp, [::]:6006->80/tcp                                                          tei-embedding-server
```
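As an additional check, you can probe a couple of the backing services directly. The short Python sketch below queries the TEI embedding server and the vLLM OpenAI-compatible endpoint on the ports published by the default compose file; it assumes the `requests` package is installed on the host.

```python
# Quick probes against two backing services (ports are from the default compose file).
import requests

host_ip = "localhost"  # or the value of host_ip from the .env file

# TEI embedding server: returns one embedding vector per input string
emb = requests.post(f"http://{host_ip}:6006/embed", json={"inputs": "hello"}, timeout=30)
print("embedding dimensions:", len(emb.json()[0]))

# vLLM OpenAI-compatible API: list the model(s) being served
models = requests.get(f"http://{host_ip}:9009/v1/models", timeout=30).json()
print("served model:", models["data"][0]["id"])
```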
### Test the Pipeline

Once the HybridRAG services are running, run data ingestion. The following command ingests unstructured data:

```bash
cd GenAIExamples/HybridRAG/tests
curl -X POST -H "Content-Type: multipart/form-data" \
    -F "files=@./Diabetes.txt" \
    -F "files=@./Acne_Vulgaris.txt" \
    -F "chunk_size=300" \
    -F "chunk_overlap=20" \
    http://${host_ip}:6007/v1/dataprep/ingest
```
By default, the application is pre-seeded with structured data and schema. To create a knowledge graph with custom data and schema, set the `cypher_insert` environment variable prior to application deployment. Below is an example:

```bash
export cypher_insert='
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/e/2PACX-1vQCEUxVlMZwwI2sn2T1aulBrRzJYVpsM9no8AEsYOOklCDTljoUIBHItGnqmAez62wwLpbvKMr7YoHI/pub?gid=0&single=true&output=csv" AS rows

MERGE (d:disease {name:rows.Disease})
MERGE (dt:diet {name:rows.Diet})
MERGE (d)-[:HOME_REMEDY]->(dt)

MERGE (m:medication {name:rows.Medication})
MERGE (d)-[:TREATMENT]->(m)

MERGE (s:symptoms {name:rows.Symptom})
MERGE (d)-[:MANIFESTATION]->(s)

MERGE (p:precaution {name:rows.Precaution})
MERGE (d)-[:PREVENTION]->(p)
'
```
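If you want to confirm that the graph was populated, one option is to query Neo4j directly. The sketch below uses the official `neo4j` Python driver against the Bolt port published by the compose file; the credentials shown are placeholders and must match the Neo4j username and password configured for your deployment (for example via set_env.sh).

```python
# Optional graph check; requires `pip install neo4j`.
# Bolt port 7687 is from the compose file; credentials are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    record = session.run("MATCH (d:disease) RETURN count(d) AS n").single()
    print(f"{record['n']} disease nodes in the graph")
driver.close()
```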
If the graph database is already populated, you can skip the knowledge graph generation by setting the `refresh_db` environment variable:

```bash
export refresh_db='False'
```

Now test the pipeline using the following command:

```bash
curl -s -X POST -d '{"messages": "what are the symptoms for Diabetes?"}' \
    -H 'Content-Type: application/json' \
    "${host_ip}:8888/v1/hybridrag"
```
To collect per-request latency for the pipeline, run the following:

```bash
curl -o /dev/null -s -w "Total Time: %{time_total}s\n" \
    -X POST \
    -d '{"messages": "what are the symptoms for Diabetes?"}' \
    -H 'Content-Type: application/json' \
    "${host_ip}:8888/v1/hybridrag"
```

**Note** The value of _host_ip_ was set using the _set_env.sh_ script and can be found in the _.env_ file.

### Cleanup the Deployment

To stop the containers associated with the deployment, execute the following command:

```
docker compose -f compose.yaml down
```

All the HybridRAG containers will be stopped and then removed on completion of the `down` command.
### Review comments

**Reviewer:** Does this default pre-seeding refer to https://github.com/opea-project/GenAIComps/blob/d5db8826a2fea34bc6440bed53f45c9a5ac29e23/comps/text2cypher/src/integrations/cypher_utils.py#L20, or is it specific to this demo somewhere else?
**Reply:** The pre-seeding is part of text2cypher, so yes, it's here: https://github.com/opea-project/GenAIComps/blob/d5db8826a2fea34bc6440bed53f45c9a5ac29e23/comps/text2cypher/src/integrations/cypher_utils.py#L20