
Add HybridRAG as a new application in the GenAIExamples #1968

Open · wants to merge 38 commits into main

Commits (38)
- 4514666 enable hybridrag pipeline (jeanyu-habana, May 9, 2025)
- 46e51a1 change text2cypher port name (siddhivelankar23, May 13, 2025)
- fb3bfaa add test script (jeanyu-habana, May 14, 2025)
- 8a63852 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana, May 14, 2025)
- d5a9ad2 add ui (siddhivelankar23, May 14, 2025)
- 15fbd1b add README and update build/test scripts (jeanyu-habana, May 15, 2025)
- 02952a9 update deployment README (jeanyu-habana, May 15, 2025)
- da9658c update to enable existing knowledge graph (jeanyu-habana, May 15, 2025)
- 26795e9 Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana, May 15, 2025)
- 03fd3b9 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], May 15, 2025)
- 91ef6d5 resolve precommit-ci error (jeanyu-habana, May 15, 2025)
- 4fbd539 resolve merge conflicts (jeanyu-habana, May 15, 2025)
- 611a5b2 fixed var name (jeanyu-habana, May 15, 2025)
- f927076 skip txt files during codespell (jeanyu-habana, May 15, 2025)
- fcb88fb skip text files in codespell (jeanyu-habana, May 16, 2025)
- ec6c0a8 clarify the purpose of data files (jeanyu-habana, May 16, 2025)
- 752e970 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], May 16, 2025)
- 803a17a fix build env (jeanyu-habana, May 16, 2025)
- 31e993f Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana, May 16, 2025)
- 92393c4 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana, May 16, 2025)
- 970121b update build/test (jeanyu-habana, May 16, 2025)
- 12020d3 debug (jeanyu-habana, May 16, 2025)
- e041b2a fixed port (jeanyu-habana, May 17, 2025)
- d07c2da updated env (jeanyu-habana, May 17, 2025)
- 2a4a96c debug (jeanyu-habana, May 17, 2025)
- fe751df debug (jeanyu-habana, May 17, 2025)
- 30d51da debug (jeanyu-habana, May 19, 2025)
- 6ffeef2 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], May 19, 2025)
- bf173ae debug (jeanyu-habana, May 19, 2025)
- 65b5328 Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana, May 19, 2025)
- a3715a8 debug (jeanyu-habana, May 19, 2025)
- 11293a5 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], May 19, 2025)
- 6c47bb6 debug (jeanyu-habana, May 19, 2025)
- ef52eb7 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana, May 19, 2025)
- d20696a debug (jeanyu-habana, May 19, 2025)
- bcb5960 debug (jeanyu-habana, May 19, 2025)
- 9f4be64 debug (jeanyu-habana, May 19, 2025)
- ed5b2a1 debug (jeanyu-habana, May 19, 2025)
10 changes: 10 additions & 0 deletions HybridRAG/Dockerfile
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

ARG BASE_TAG=latest
FROM opea/comps-base:$BASE_TAG

COPY ./hybridrag.py $HOME/hybridrag.py

ENTRYPOINT ["python", "hybridrag.py"]
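
For reference, a standalone build-and-run check of this image might look like the sketch below. The `opea/hybridrag:latest` tag and the 8888 port come from the deployment README later in this PR; on its own, the backend still needs the other microservices running before it can answer queries.

```bash
# Build the megaservice image from this Dockerfile (run from HybridRAG/).
docker build -t opea/hybridrag:latest -f Dockerfile .

# Start just the backend; port 8888 matches the published port shown in the
# deployment README. Queries will fail until the dependent services are up.
docker run --rm -p 8888:8888 opea/hybridrag:latest
```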

98 changes: 98 additions & 0 deletions HybridRAG/README.md
# HybridRAG Application

Enterprise AI systems require solutions that handle both structured data (databases, transactions, CSVs, JSON) and unstructured data (documents, images, audio). While traditional VectorRAG excels at semantic search across documents, it struggles with complex queries that require global context or relationship-aware reasoning. The HybridRAG application addresses these gaps by combining GraphRAG (knowledge-graph-based retrieval) with VectorRAG (vector-database retrieval) for improved accuracy and contextual relevance.

## Table of contents

1. [Architecture](#architecture)
2. [Deployment](#deployment)

## Architecture

The HybridRAG application is a customizable end-to-end workflow that combines the capabilities of LLMs with both graph-based and vector-based retrieval. The HybridRAG architecture is shown below:

![architecture](./assets/img/hybridrag_retriever_architecture.png)

This application is modular: each component is a microservice (as defined in [GenAIComps](https://github.com/opea-project/GenAIComps)) that can scale independently. It comprises data preparation, embedding, retrieval, reranker (optional), and LLM microservices, all stitched together by the HybridRAG megaservice, which orchestrates data flow through the pipeline. The flow chart below shows the information flow between the microservices for this example.

```mermaid
---
config:
flowchart:
nodeSpacing: 400
rankSpacing: 100
curve: linear
themeVariables:
fontSize: 50px
---
flowchart LR
%% Colors %%
classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef invisible fill:transparent,stroke:transparent;
style HybridRAG-MegaService stroke:#000000

%% Subgraphs %%
subgraph HybridRAG-MegaService["HybridRAG MegaService "]
direction LR
EM([Embedding MicroService]):::blue
RET([Retrieval MicroService]):::blue
RER([Rerank MicroService]):::blue
T2C([Text2Cypher MicroService]):::blue
LLM([LLM MicroService]):::blue
end
subgraph UserInterface[" User Interface "]
direction LR
a([User Input Query]):::orchid
UI([UI server<br>]):::orchid
end



TEI_RER{{Reranking service<br>}}
TEI_EM{{Embedding service <br>}}
VDB{{Vector DB<br><br>}}
GDB{{Graph DB<br><br>}}
R_RET{{Retriever service <br>}}
DP([Data Preparation MicroService]):::blue
S2G([Struct2Graph MicroService]):::blue
LLM_gen{{LLM Service <br>}}
GW([HybridRAG GateWay<br>]):::orange

%% Questions interaction
a[User Input Query] --> UI
UI --> GW
GW <==> HybridRAG-MegaService
EM ==> RET
RET ==> RER
RER ==> LLM
T2C ==> LLM


%% Embedding service flow
EM <-.-> TEI_EM
RET <-.-> R_RET
RER <-.-> TEI_RER
LLM <-.-> LLM_gen

%% Vector DB interaction
R_RET <-.->|d|VDB
DP <-.->|d|VDB

%% Graph DB interaction
T2C <-.->|d|GDB
S2G <-.->|d|GDB

```

## Deployment

[HybridRAG deployment on Intel Gaudi](./docker_compose/intel/hpu/gaudi/README.md)
3 changes: 3 additions & 0 deletions HybridRAG/README_NOTICE.md
# Notice for FFmpeg:

FFmpeg is an open source project licensed under LGPL and GPL. See https://www.ffmpeg.org/legal.html. You are solely responsible for determining if your use of FFmpeg requires any additional licenses. Intel is not responsible for obtaining any such licenses, nor liable for any licensing fees due, in connection with your use of FFmpeg.
160 changes: 160 additions & 0 deletions HybridRAG/docker_compose/intel/hpu/gaudi/README.md
# Example HybridRAG deployments on an Intel® Gaudi® Platform

This example covers a single-node, on-premises deployment of the HybridRAG example using OPEA components. There are various ways to enable HybridRAG, but this guide focuses on deploying the complete pipeline to Intel® Gaudi® AI Accelerators with Docker Compose.

**Note** This example requires access to a properly installed Intel® Gaudi® platform with a functional Docker service configured to use the habanalabs-container-runtime. Please consult the [Intel® Gaudi® software Installation Guide](https://docs.habana.ai/en/v1.20.0/Installation_Guide/Driver_Installation.html) for more information.

## HybridRAG Quick Start Deployment

This section describes how to quickly deploy and test the HybridRAG service manually on an Intel® Gaudi® platform. The basic steps are covered in the sections below.

### Access the Code

Clone the GenAIExample repository and access the HybridRAG Intel® Gaudi® platform Docker Compose files and supporting scripts:

```bash
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/HybridRAG/docker_compose/intel/hpu/gaudi/
```

Check out a released version, such as v1.4:

```bash
git checkout v1.4
```

### Generate a HuggingFace Access Token

Some HuggingFace resources, such as gated models, are accessible only with an access token. If you do not already have a HuggingFace access token, create an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generate a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
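
Once generated, the token is typically exported as an environment variable before deployment; a minimal sketch is shown below. The exact variable name consumed by the compose files is an assumption here (check _set_env.sh_); OPEA deployments commonly use `HF_TOKEN` or `HUGGINGFACEHUB_API_TOKEN`.

```bash
# Replace with your actual token. The variable name is an assumption;
# verify it against set_env.sh for this deployment.
export HF_TOKEN="<your-huggingface-token>"
```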

### Configure the Deployment Environment

To set up environment variables for deploying HybridRAG services, source the _set_env.sh_ script in this directory:

```bash
source ./set_env.sh
```
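
After sourcing the script, a quick sanity check helps catch an unset address before deployment; `host_ip` is referenced throughout this guide and, per the note at the end, is recorded in the _.env_ file.

```bash
# Confirm that the host address was exported; an empty value here means
# set_env.sh did not run cleanly.
echo "host_ip=${host_ip}"
```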

### Deploy the Services Using Docker Compose

To deploy the HybridRAG services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
docker compose up -d
```

The HybridRAG Docker images should be downloaded automatically from the OPEA registry and deployed on the Intel® Gaudi® platform:

```
[+] Running 9/9
✔ Container redis-vector-db Healthy 6.4s
✔ Container vllm-service Started 0.4s
✔ Container tei-embedding-server Started 0.9s
✔ Container neo4j-apoc Healthy 11.4s
✔ Container tei-reranking-server Started 0.8s
✔ Container retriever-redis-server Started 1.0s
✔ Container dataprep-redis-server Started 6.5s
✔ Container text2cypher-gaudi-container Started 12.2s
✔ Container hybridrag-xeon-backend-server Started 12.4s
```

To rebuild the Docker image for the `hybridrag-xeon-backend-server` container:

```bash
cd GenAIExamples/HybridRAG
docker build --no-cache -t opea/hybridrag:latest -f Dockerfile .
```
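
After rebuilding, the backend can be recreated in place so it picks up the new image. The service name below is an assumption based on the container name shown earlier; verify it against `compose.yaml`.

```bash
# Recreate only the backend service with the freshly built image.
cd docker_compose/intel/hpu/gaudi
docker compose up -d --force-recreate hybridrag-xeon-backend-server
```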

### Check the Deployment Status

After running `docker compose`, check that all the containers it launched have started:

```bash
docker ps -a
```

For the default deployment, the following nine containers should have started:

```
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a9286abd0015 opea/hybridrag:latest "python hybridrag.py" 15 hours ago Up 15 hours 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp hybridrag-xeon-backend-server
8477b154dc72 opea/text2cypher-gaudi:latest "/bin/sh -c 'bash ru…" 15 hours ago Up 15 hours 0.0.0.0:11801->9097/tcp, [::]:11801->9097/tcp text2cypher-gaudi-container
688e01a431fa opea/dataprep:latest "sh -c 'python $( [ …" 15 hours ago Up 15 hours 0.0.0.0:6007->5000/tcp, [::]:6007->5000/tcp dataprep-redis-server
54f574fe54bb opea/retriever:latest "python opea_retriev…" 15 hours ago Up 15 hours 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-redis-server
5028eb66617c ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 "text-embeddings-rou…" 15 hours ago Up 15 hours 0.0.0.0:8808->80/tcp, [::]:8808->80/tcp tei-reranking-server
a9dbf8a13365 opea/vllm:latest "python3 -m vllm.ent…" 15 hours ago Up 15 hours (healthy) 0.0.0.0:9009->80/tcp, [::]:9009->80/tcp vllm-service
43f44830f47b neo4j:latest "tini -g -- /startup…" 15 hours ago Up 15 hours (healthy) 0.0.0.0:7474->7474/tcp, :::7474->7474/tcp, 7473/tcp, 0.0.0.0:7687->7687/tcp, :::7687->7687/tcp neo4j-apoc
867feabb6f11 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 15 hours ago Up 15 hours (healthy) 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db
23cd7f16453b ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 "text-embeddings-rou…" 15 hours ago Up 15 hours 0.0.0.0:6006->80/tcp, [::]:6006->80/tcp tei-embedding-server
```
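
If a container is missing from this list or keeps restarting, its logs are the first place to look; the container names below come from the table above.

```bash
# Inspect the logs of a container that failed to start or is unhealthy.
docker logs -f hybridrag-xeon-backend-server
docker logs -f text2cypher-gaudi-container
```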

### Test the Pipeline

Once the HybridRAG services are running, run data ingestion. The following command ingests unstructured data:

```bash
cd GenAIExamples/HybridRAG/tests
curl -X POST -H "Content-Type: multipart/form-data" \
-F "files=@./Diabetes.txt" \
-F "files=@./Acne_Vulgaris.txt" \
-F "chunk_size=300" \
-F "chunk_overlap=20" \
http://${host_ip}:6007/v1/dataprep/ingest
```
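
To sanity-check the ingestion, the OPEA dataprep service in other examples exposes a file-listing endpoint; the `/v1/dataprep/get` path below is an assumption borrowed from those deployments and may differ by release.

```bash
# List the files the dataprep service has indexed (endpoint name is an
# assumption; adjust if your release differs).
curl -X POST -H "Content-Type: application/json" \
    http://${host_ip}:6007/v1/dataprep/get
```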

By default, the application is pre-seeded with structured data and schema. To build the knowledge graph from custom data and schema, set the `cypher_insert` environment variable prior to application deployment. Below is an example:

```bash
export cypher_insert='
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/e/2PACX-1vQCEUxVlMZwwI2sn2T1aulBrRzJYVpsM9no8AEsYOOklCDTljoUIBHItGnqmAez62wwLpbvKMr7YoHI/pub?gid=0&single=true&output=csv" AS rows
MERGE (d:disease {name:rows.Disease})
MERGE (dt:diet {name:rows.Diet})
MERGE (d)-[:HOME_REMEDY]->(dt)

MERGE (m:medication {name:rows.Medication})
MERGE (d)-[:TREATMENT]->(m)

MERGE (s:symptoms {name:rows.Symptom})
MERGE (d)-[:MANIFESTATION]->(s)

MERGE (p:precaution {name:rows.Precaution})
MERGE (d)-[:PREVENTION]->(p)
'
```
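
Once deployed with a custom `cypher_insert`, the seeded graph can be spot-checked through Neo4j's HTTP transaction endpoint on port 7474 (mapped in the container table above). The credential variables below are assumptions; use whatever `compose.yaml` configures for the `neo4j-apoc` container.

```bash
# Query a few disease nodes to verify the graph was populated. The node label
# comes from the cypher_insert example above; credentials are assumptions.
curl -u "${NEO4J_USERNAME}:${NEO4J_PASSWORD}" \
    -H "Content-Type: application/json" \
    -d '{"statements":[{"statement":"MATCH (d:disease) RETURN d.name LIMIT 5"}]}' \
    "http://${host_ip}:7474/db/neo4j/tx/commit"
```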

If the graph database is already populated, you can skip the knowledge graph generation step by setting the `refresh_db` environment variable:

```bash
export refresh_db='False'
```

Now test the pipeline using the following command:

```bash
curl -s -X POST -d '{"messages": "what are the symptoms for Diabetes?"}' \
-H 'Content-Type: application/json' \
"${host_ip}:8888/v1/hybridrag"
```
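
The service responds with a generated answer. If the response follows the OpenAI-style ChatCompletion shape used by other OPEA megaservices (an assumption for this pipeline), the answer text can be extracted with `jq`:

```bash
# Pull out just the generated answer; the response shape is an assumption
# based on other OPEA megaservices.
curl -s -X POST -d '{"messages": "what are the symptoms for Diabetes?"}' \
    -H 'Content-Type: application/json' \
    "${host_ip}:8888/v1/hybridrag" | jq -r '.choices[0].message.content'
```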

To collect per-request latency for the pipeline, run the following:

```bash
curl -o /dev/null -s -w "Total Time: %{time_total}s\n" \
-X POST \
-d '{"messages": "what are the symptoms for Diabetes?"}' \
-H 'Content-Type: application/json' \
"${host_ip}:8888/v1/hybridrag"
```
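
A single measurement can be noisy; a short loop gives a more representative sample:

```bash
# Issue five sequential requests and report the total time for each.
for i in $(seq 1 5); do
  curl -o /dev/null -s -w "run ${i}: %{time_total}s\n" \
    -X POST \
    -d '{"messages": "what are the symptoms for Diabetes?"}' \
    -H 'Content-Type: application/json' \
    "${host_ip}:8888/v1/hybridrag"
done
```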

**Note** The value of _host_ip_ was set using the _set_env.sh_ script and can be found in the _.env_ file.

### Cleanup the Deployment

To stop the containers associated with the deployment, execute the following command:

```bash
docker compose -f compose.yaml down
```

All the HybridRAG containers will be stopped and removed when the `down` command completes.
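
To also discard persisted state (for example, the Redis and Neo4j data) for a clean redeployment, volumes can be removed as well; whether this deployment declares named volumes is an assumption, so check `compose.yaml` first.

```bash
# Stop the containers and remove any named volumes declared by the compose file.
docker compose -f compose.yaml down -v
```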