# Add HybridRAG as a new application in the GenAIExamples #1968
**Open:** jeanyu-habana wants to merge 38 commits into `opea-project:main` from `jeanyu-habana:dev`.
## Commits (38)
- 4514666 enable hybridrag pipeline (jeanyu-habana)
- 46e51a1 change text2cypher port name (siddhivelankar23)
- fb3bfaa add test script (jeanyu-habana)
- 8a63852 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana)
- d5a9ad2 add ui (siddhivelankar23)
- 15fbd1b add README and update build/test scripts (jeanyu-habana)
- 02952a9 update deployment README (jeanyu-habana)
- da9658c update to enable existing knowledge graph (jeanyu-habana)
- 26795e9 Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana)
- 03fd3b9 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 91ef6d5 resolve precommit-ci error (jeanyu-habana)
- 4fbd539 resolve merge conflicts (jeanyu-habana)
- 611a5b2 fixed var name (jeanyu-habana)
- f927076 skip txt files during codespell (jeanyu-habana)
- fcb88fb skip text files in codespell (jeanyu-habana)
- ec6c0a8 clarify the purpose of data files (jeanyu-habana)
- 752e970 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 803a17a fix build env (jeanyu-habana)
- 31e993f Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana)
- 92393c4 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana)
- 970121b update build/test (jeanyu-habana)
- 12020d3 debug (jeanyu-habana)
- e041b2a fixed port (jeanyu-habana)
- d07c2da updated env (jeanyu-habana)
- 2a4a96c debug (jeanyu-habana)
- fe751df debug (jeanyu-habana)
- 30d51da debug (jeanyu-habana)
- 6ffeef2 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- bf173ae debug (jeanyu-habana)
- 65b5328 Merge remote-tracking branch 'upstream/main' into dev (jeanyu-habana)
- a3715a8 debug (jeanyu-habana)
- 11293a5 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 6c47bb6 debug (jeanyu-habana)
- ef52eb7 Merge remote-tracking branch 'origin/dev' into dev (jeanyu-habana)
- d20696a debug (jeanyu-habana)
- bcb5960 debug (jeanyu-habana)
- 9f4be64 debug (jeanyu-habana)
- ed5b2a1 debug (jeanyu-habana)
## Files changed
**New file: `HybridRAG/Dockerfile`**

```dockerfile
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

ARG BASE_TAG=latest
FROM opea/comps-base:$BASE_TAG

COPY ./hybridrag.py $HOME/hybridrag.py

ENTRYPOINT ["python", "hybridrag.py"]
```
**New file: `HybridRAG/README.md`**
# HybridRAG Application

Enterprise AI systems require solutions that handle both structured data (databases, transactions, CSVs, JSON) and unstructured data (documents, images, audio). While traditional VectorRAG excels at semantic search across documents, it struggles with complex queries that require global context or relationship-aware reasoning. The HybridRAG application addresses these gaps by combining GraphRAG (knowledge graph-based retrieval) with VectorRAG (vector database retrieval) for enhanced accuracy and contextual relevance.

## Table of contents

1. [Architecture](#architecture)
2. [Deployment](#deployment)

## Architecture

The HybridRAG application is a customizable end-to-end workflow that leverages the capabilities of LLMs and RAG efficiently. The HybridRAG architecture is shown below:

![HybridRAG Architecture](assets/img/hybridrag_retrieval.png)

This application is modular: each component is a microservice (as defined in [GenAIComps](https://github.com/opea-project/GenAIComps)) that can scale independently. It comprises data preparation, embedding, retrieval, reranker (optional), and LLM microservices. All of these microservices are stitched together by the HybridRAG megaservice, which orchestrates the data flow through them. The flow chart below shows the information flow between the microservices for this example.
```mermaid
---
config:
  flowchart:
    nodeSpacing: 400
    rankSpacing: 100
    curve: linear
  themeVariables:
    fontSize: 50px
---
flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style HybridRAG-MegaService stroke:#000000

    %% Subgraphs %%
    subgraph HybridRAG-MegaService["HybridRAG MegaService "]
        direction LR
        EM([Embedding MicroService]):::blue
        RET([Retrieval MicroService]):::blue
        RER([Rerank MicroService]):::blue
        LLM([LLM MicroService]):::blue
        direction LR
        T2C([Text2Cypher MicroService]):::blue
        LLM([LLM MicroService]):::blue
    end
    subgraph UserInterface[" User Interface "]
        direction LR
        a([User Input Query]):::orchid
        UI([UI server<br>]):::orchid
    end

    TEI_RER{{Reranking service<br>}}
    TEI_EM{{Embedding service <br>}}
    VDB{{Vector DB<br><br>}}
    GDB{{Graph DB<br><br>}}
    R_RET{{Retriever service <br>}}
    DP([Data Preparation MicroService]):::blue
    S2G([Struct2Graph MicroService]):::blue
    LLM_gen{{LLM Service <br>}}
    GW([HybridRAG GateWay<br>]):::orange

    %% Questions interaction
    direction LR
    a[User Input Query] --> UI
    UI --> GW
    GW <==> HybridRAG-MegaService
    EM ==> RET
    RET ==> RER
    RER ==> LLM
    direction LR
    T2C ==> LLM

    %% Embedding service flow
    direction LR
    EM <-.-> TEI_EM
    RET <-.-> R_RET
    RER <-.-> TEI_RER
    LLM <-.-> LLM_gen

    direction TB
    %% Vector DB interaction
    R_RET <-.->|d|VDB
    DP <-.->|d|VDB

    direction TB
    %% Graph DB interaction
    T2C <-.->|d|GDB
    S2G <-.->|d|GDB
```
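To make the orchestration concrete, the sketch below shows, in simplified Python, how the two retrieval branches (VectorRAG and GraphRAG) can be combined before generation. This is an illustrative sketch only, not the shipped `hybridrag.py` megaservice: the endpoint paths, payload shapes, and model name are assumptions, the ports are taken from the default Gaudi Docker Compose deployment, and the reranking step is omitted for brevity.

```python
# Illustrative sketch of the HybridRAG flow (not the shipped hybridrag.py).
# Endpoint paths, payload shapes, and the model name below are assumptions.
import requests

HOST = "localhost"


def vector_branch(query: str) -> str:
    """VectorRAG branch: embed the query, then fetch similar chunks from the vector DB."""
    embedding = requests.post(f"http://{HOST}:6006/embed",
                              json={"inputs": query}, timeout=60).json()[0]
    docs = requests.post(f"http://{HOST}:7000/v1/retrieval",
                         json={"text": query, "embedding": embedding}, timeout=60).json()
    return "\n".join(d.get("text", "") for d in docs.get("retrieved_docs", []))


def graph_branch(query: str) -> str:
    """GraphRAG branch: translate the question to Cypher and query the knowledge graph."""
    resp = requests.post(f"http://{HOST}:11801/v1/text2cypher",
                         json={"input_text": query}, timeout=120)
    return resp.text


def hybrid_answer(query: str) -> str:
    """Merge both contexts and let the LLM generate the final answer."""
    context = vector_branch(query) + "\n" + graph_branch(query)
    completion = requests.post(
        f"http://{HOST}:9009/v1/chat/completions",
        json={
            "model": "meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model name
            "messages": [
                {"role": "system", "content": f"Answer using this context:\n{context}"},
                {"role": "user", "content": query},
            ],
        },
        timeout=300,
    ).json()
    return completion["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(hybrid_answer("what are the symptoms for Diabetes?"))
```

In the deployed application, this orchestration is handled by the HybridRAG megaservice behind the single `/v1/hybridrag` gateway endpoint.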
## Deployment

[HybridRAG deployment on Intel Gaudi](./docker_compose/intel/hpu/gaudi/README.md)
**New file: FFmpeg notice**

# Notice for FFmpeg:

FFmpeg is an open source project licensed under LGPL and GPL. See https://www.ffmpeg.org/legal.html. You are solely responsible for determining if your use of FFmpeg requires any additional licenses. Intel is not responsible for obtaining any such licenses, nor liable for any licensing fees due, in connection with your use of FFmpeg.
*(Binary file not shown.)*

**New file: `HybridRAG/docker_compose/intel/hpu/gaudi/README.md`**
# Example HybridRAG deployments on an Intel® Gaudi® Platform

This example covers a single-node, on-premises deployment of the HybridRAG example using OPEA components. There are various ways to enable HybridRAG; this example focuses on deploying the HybridRAG pipeline to Intel® Gaudi® AI Accelerators with Docker Compose.

**Note** This example requires access to a properly installed Intel® Gaudi® platform with a functional Docker service configured to use the habanalabs-container-runtime. Please consult the [Intel® Gaudi® software Installation Guide](https://docs.habana.ai/en/v1.20.0/Installation_Guide/Driver_Installation.html) for more information.

## HybridRAG Quick Start Deployment

This section describes how to quickly deploy and test the HybridRAG service manually on an Intel® Gaudi® platform. The basic steps are covered in the subsections below.

### Access the Code

Clone the GenAIExamples repository and access the HybridRAG Intel® Gaudi® platform Docker Compose files and supporting scripts:

```
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/HybridRAG/docker_compose/intel/hpu/gaudi/
```

Check out a released version, such as v1.4:

```
git checkout v1.4
```

### Generate a HuggingFace Access Token

Some HuggingFace resources, such as some models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).
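Once you have a token, you can optionally confirm it is valid before deploying. The snippet below is a quick sanity check using the `huggingface_hub` Python package; it is not part of the deployment itself, and the token value is a placeholder.

```python
# Optional token sanity check; requires `pip install huggingface_hub`.
from huggingface_hub import whoami

token = "hf_..."  # paste your HuggingFace access token here
print(whoami(token=token)["name"])  # prints your account name if the token is valid
```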
### Configure the Deployment Environment

To set up environment variables for deploying the HybridRAG services, source the _set_env.sh_ script in this directory:

```
source ./set_env.sh
```
### Deploy the Services Using Docker Compose

To deploy the HybridRAG services, execute the `docker compose up` command with the appropriate arguments. For a default deployment, execute:

```bash
docker compose up -d
```

The HybridRAG docker images should automatically be downloaded from the `OPEA registry` and deployed on the Intel® Gaudi® Platform:

```
[+] Running 9/9
 ✔ Container redis-vector-db                Healthy   6.4s
 ✔ Container vllm-service                   Started   0.4s
 ✔ Container tei-embedding-server           Started   0.9s
 ✔ Container neo4j-apoc                     Healthy  11.4s
 ✔ Container tei-reranking-server           Started   0.8s
 ✔ Container retriever-redis-server         Started   1.0s
 ✔ Container dataprep-redis-server          Started   6.5s
 ✔ Container text2cypher-gaudi-container    Started  12.2s
 ✔ Container hybridrag-xeon-backend-server  Started  12.4s
```
To rebuild the docker image for the hybridrag-xeon-backend-server container:

```
cd GenAIExamples/HybridRAG
docker build --no-cache -t opea/hybridrag:latest -f Dockerfile .
```
### Check the Deployment Status

After running docker compose, check if all the containers launched via docker compose have started:

```
docker ps -a
```

For the default deployment, the following 9 containers should have started:

```
CONTAINER ID   IMAGE                                                   COMMAND                  CREATED        STATUS                  PORTS                                                                                            NAMES
a9286abd0015   opea/hybridrag:latest                                   "python hybridrag.py"    15 hours ago   Up 15 hours             0.0.0.0:8888->8888/tcp, :::8888->8888/tcp                                                        hybridrag-xeon-backend-server
8477b154dc72   opea/text2cypher-gaudi:latest                           "/bin/sh -c 'bash ru…"   15 hours ago   Up 15 hours             0.0.0.0:11801->9097/tcp, [::]:11801->9097/tcp                                                    text2cypher-gaudi-container
688e01a431fa   opea/dataprep:latest                                    "sh -c 'python $( [ …"   15 hours ago   Up 15 hours             0.0.0.0:6007->5000/tcp, [::]:6007->5000/tcp                                                      dataprep-redis-server
54f574fe54bb   opea/retriever:latest                                   "python opea_retriev…"   15 hours ago   Up 15 hours             0.0.0.0:7000->7000/tcp, :::7000->7000/tcp                                                        retriever-redis-server
5028eb66617c   ghcr.io/huggingface/text-embeddings-inference:cpu-1.6   "text-embeddings-rou…"   15 hours ago   Up 15 hours             0.0.0.0:8808->80/tcp, [::]:8808->80/tcp                                                          tei-reranking-server
a9dbf8a13365   opea/vllm:latest                                        "python3 -m vllm.ent…"   15 hours ago   Up 15 hours (healthy)   0.0.0.0:9009->80/tcp, [::]:9009->80/tcp                                                          vllm-service
43f44830f47b   neo4j:latest                                            "tini -g -- /startup…"   15 hours ago   Up 15 hours (healthy)   0.0.0.0:7474->7474/tcp, :::7474->7474/tcp, 7473/tcp, 0.0.0.0:7687->7687/tcp, :::7687->7687/tcp   neo4j-apoc
867feabb6f11   redis/redis-stack:7.2.0-v9                              "/entrypoint.sh"         15 hours ago   Up 15 hours (healthy)   0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp             redis-vector-db
23cd7f16453b   ghcr.io/huggingface/text-embeddings-inference:cpu-1.6   "text-embeddings-rou…"   15 hours ago   Up 15 hours             0.0.0.0:6006->80/tcp, [::]:6006->80/tcp                                                          tei-embedding-server
```
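As an additional check, you can probe a couple of the backing services directly. The short Python sketch below queries the TEI embedding server and the vLLM OpenAI-compatible endpoint on the ports published by the default compose file; it assumes the `requests` package is installed on the host.

```python
# Quick probes against two backing services (ports are from the default compose file).
import requests

host_ip = "localhost"  # or the value of host_ip from the .env file

# TEI embedding server: returns one embedding vector per input string
emb = requests.post(f"http://{host_ip}:6006/embed", json={"inputs": "hello"}, timeout=30)
print("embedding dimensions:", len(emb.json()[0]))

# vLLM OpenAI-compatible API: list the model(s) being served
models = requests.get(f"http://{host_ip}:9009/v1/models", timeout=30).json()
print("served model:", models["data"][0]["id"])
```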
### Test the Pipeline

Once the HybridRAG services are running, run data ingestion. The following command ingests unstructured data:

```bash
cd GenAIExamples/HybridRAG/tests
curl -X POST -H "Content-Type: multipart/form-data" \
    -F "files=@./Diabetes.txt" \
    -F "files=@./Acne_Vulgaris.txt" \
    -F "chunk_size=300" \
    -F "chunk_overlap=20" \
    http://${host_ip}:6007/v1/dataprep/ingest
```
By default, the application is pre-seeded with structured data and schema. To create a knowledge graph with custom data and schema, set the `cypher_insert` environment variable prior to application deployment. Below is an example:

```bash
export cypher_insert='
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/e/2PACX-1vQCEUxVlMZwwI2sn2T1aulBrRzJYVpsM9no8AEsYOOklCDTljoUIBHItGnqmAez62wwLpbvKMr7YoHI/pub?gid=0&single=true&output=csv" AS rows

MERGE (d:disease {name:rows.Disease})
MERGE (dt:diet {name:rows.Diet})
MERGE (d)-[:HOME_REMEDY]->(dt)

MERGE (m:medication {name:rows.Medication})
MERGE (d)-[:TREATMENT]->(m)

MERGE (s:symptoms {name:rows.Symptom})
MERGE (d)-[:MANIFESTATION]->(s)

MERGE (p:precaution {name:rows.Precaution})
MERGE (d)-[:PREVENTION]->(p)
'
```
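If you want to confirm that the graph was populated, one option is to query Neo4j directly. The sketch below uses the official `neo4j` Python driver against the Bolt port published by the compose file; the credentials shown are placeholders and must match the Neo4j username and password configured for your deployment (for example via set_env.sh).

```python
# Optional graph check; requires `pip install neo4j`.
# Bolt port 7687 is from the compose file; credentials are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    record = session.run("MATCH (d:disease) RETURN count(d) AS n").single()
    print(f"{record['n']} disease nodes in the graph")
driver.close()
```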
If the graph database is already populated, you can skip the knowledge graph generation by setting the `refresh_db` environment variable:

```bash
export refresh_db='False'
```

Now test the pipeline using the following command:

```bash
curl -s -X POST -d '{"messages": "what are the symptoms for Diabetes?"}' \
    -H 'Content-Type: application/json' \
    "${host_ip}:8888/v1/hybridrag"
```
To collect per-request latency for the pipeline, run the following:

```bash
curl -o /dev/null -s -w "Total Time: %{time_total}s\n" \
    -X POST \
    -d '{"messages": "what are the symptoms for Diabetes?"}' \
    -H 'Content-Type: application/json' \
    "${host_ip}:8888/v1/hybridrag"
```

**Note** The value of _host_ip_ was set using the _set_env.sh_ script and can be found in the _.env_ file.

### Cleanup the Deployment

To stop the containers associated with the deployment, execute the following command:

```
docker compose -f compose.yaml down
```

All the HybridRAG containers will be stopped and then removed on completion of the `down` command.
### Review comments

**Reviewer:** Does this default pre-seeding refer to https://github.com/opea-project/GenAIComps/blob/d5db8826a2fea34bc6440bed53f45c9a5ac29e23/comps/text2cypher/src/integrations/cypher_utils.py#L20, or is it specific to this demo somewhere else?
**Reply:** The pre-seeding is part of text2cypher, so yes, it's here: https://github.com/opea-project/GenAIComps/blob/d5db8826a2fea34bc6440bed53f45c9a5ac29e23/comps/text2cypher/src/integrations/cypher_utils.py#L20