Commit 6169ea4

add new feature and bug fix for EC-RAG (#1324)

Signed-off-by: Zhu, Yongbo <yongbo.zhu@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

1 parent: 75b0961

27 files changed: +780 additions, -116 deletions

EdgeCraftRAG/Dockerfile

Lines changed: 11 additions & 3 deletions

```diff
@@ -7,20 +7,28 @@ SHELL ["/bin/bash", "-o", "pipefail", "-c"]

 RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
     libgl1-mesa-glx \
-    libjemalloc-dev
+    libjemalloc-dev \
+    git

 RUN useradd -m -s /bin/bash user && \
     mkdir -p /home/user && \
     chown -R user /home/user/

-COPY ./requirements.txt /home/user/requirements.txt
 COPY ./chatqna.py /home/user/chatqna.py

 WORKDIR /home/user
-RUN pip install --no-cache-dir -r requirements.txt
+RUN git clone https://github.com/opea-project/GenAIComps.git
+
+WORKDIR /home/user/GenAIComps
+RUN pip install --no-cache-dir --upgrade pip setuptools && \
+    pip install --no-cache-dir -r /home/user/GenAIComps/requirements.txt
+
+ENV PYTHONPATH=$PYTHONPATH:/home/user/GenAIComps

 USER user

+WORKDIR /home/user
+
 RUN echo 'ulimit -S -n 999999' >> ~/.bashrc

 ENTRYPOINT ["python", "chatqna.py"]
```
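The `ENV PYTHONPATH` line above is what makes the freshly cloned GenAIComps sources importable without pip-installing the package itself: Python splits the variable on `:` and adds each entry to `sys.path`. A minimal, dependency-free sketch of that colon-joining (`extend_pythonpath` is an illustrative helper, not part of the Dockerfile or EC-RAG):

```python
def extend_pythonpath(current: str, extra: str) -> str:
    """Append `extra` to a colon-separated PYTHONPATH value (illustrative helper)."""
    return f"{current}:{extra}" if current else extra

# Python splits PYTHONPATH on ":" into sys.path entries, which is why
# `import comps` later resolves against the cloned /home/user/GenAIComps tree.
joined = extend_pythonpath("/opt/lib", "/home/user/GenAIComps")
entries = joined.split(":")
```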

EdgeCraftRAG/Dockerfile.server

Lines changed: 5 additions & 1 deletion

```diff
@@ -4,7 +4,11 @@ SHELL ["/bin/bash", "-o", "pipefail", "-c"]

 RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
     libgl1-mesa-glx \
-    libjemalloc-dev
+    libjemalloc-dev \
+    libmagic1 \
+    libglib2.0-0 \
+    poppler-utils \
+    tesseract-ocr

 RUN apt-get update && apt-get install -y gnupg wget
 RUN wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
```

EdgeCraftRAG/README.md

Lines changed: 33 additions & 4 deletions

````diff
@@ -5,6 +5,14 @@ Retrieval-Augmented Generation system for edge solutions. It is designed to
 curate the RAG pipeline to meet hardware requirements at edge with guaranteed
 quality and performance.

+## What's New in this release?
+
+- Support image/url data retrieval and display in EC-RAG
+- Support display of LLM-used context sources in UI
+- Support pipeline remove operation in RESTful API and UI
+- Support RAG pipeline performance benchmark and display in UI
+- Fixed known issues in EC-RAG UI and server
+
 ## Quick Start Guide

 ### (Optional) Build Docker Images for Mega Service, Server and UI by your own
@@ -43,6 +51,8 @@ export GRADIO_PATH="your gradio cache path for transferring files"

 # Make sure all 3 folders have 1000:1000 permission, otherwise
 # chown 1000:1000 ${MODEL_PATH} ${DOC_PATH} ${GRADIO_PATH}
+# In addition, also make sure the .cache folder has 1000:1000 permission, otherwise
+# chown 1000:1000 $HOME/.cache

 # Use `ip a` to check your active ip
 export HOST_IP="your host ip"
@@ -192,7 +202,7 @@ curl -X POST http://${HOST_IP}:16010/v1/settings/pipelines -H "Content-Type: app
 #### Update a pipeline

 ```bash
-curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines -H "Content-Type: application/json" -d @tests/test_pipeline_local_llm.json | jq '.'
+curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines/rag_test_local_llm -H "Content-Type: application/json" -d @tests/test_pipeline_local_llm.json | jq '.'
 ```

 #### Check all pipelines
@@ -204,15 +214,34 @@ curl -X GET http://${HOST_IP}:16010/v1/settings/pipelines -H "Content-Type: appl
 #### Activate a pipeline

 ```bash
-curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines/test1 -H "Content-Type: application/json" -d '{"active": "true"}' | jq '.'
+curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines/rag_test_local_llm -H "Content-Type: application/json" -d '{"active": "true"}' | jq '.'
+```
+
+#### Remove a pipeline
+
+```bash
+# Firstly, deactivate the pipeline if the pipeline status is active
+curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines/rag_test_local_llm -H "Content-Type: application/json" -d '{"active": "false"}' | jq '.'
+# Then delete the pipeline
+curl -X DELETE http://${HOST_IP}:16010/v1/settings/pipelines/rag_test_local_llm -H "Content-Type: application/json" | jq '.'
+```
+
+#### Enable and check benchmark for pipelines
+
+```bash
+# Set ENABLE_BENCHMARK as true before launch services
+export ENABLE_BENCHMARK="true"
+
+# check the benchmark data for pipeline {pipeline_name}
+curl -X GET http://${HOST_IP}:16010/v1/settings/pipelines/{pipeline_name}/benchmark -H "Content-Type: application/json" | jq '.'
 ```

 ### Model Management

 #### Load a model

 ```bash
-curl -X POST http://${HOST_IP}:16010/v1/settings/models -H "Content-Type: application/json" -d '{"model_type": "reranker", "model_id": "BAAI/bge-reranker-large", "model_path": "./models/bge_ov_reranker", "device": "cpu"}' | jq '.'
+curl -X POST http://${HOST_IP}:16010/v1/settings/models -H "Content-Type: application/json" -d '{"model_type": "reranker", "model_id": "BAAI/bge-reranker-large", "model_path": "./models/bge_ov_reranker", "device": "cpu", "weight": "INT4"}' | jq '.'
 ```

 It will take some time to load the model.
@@ -226,7 +255,7 @@ curl -X GET http://${HOST_IP}:16010/v1/settings/models -H "Content-Type: applica
 #### Update a model

 ```bash
-curl -X PATCH http://${HOST_IP}:16010/v1/settings/models/BAAI/bge-reranker-large -H "Content-Type: application/json" -d '{"model_type": "reranker", "model_id": "BAAI/bge-reranker-large", "model_path": "./models/bge_ov_reranker", "device": "gpu"}' | jq '.'
+curl -X PATCH http://${HOST_IP}:16010/v1/settings/models/BAAI/bge-reranker-large -H "Content-Type: application/json" -d '{"model_type": "reranker", "model_id": "BAAI/bge-reranker-large", "model_path": "./models/bge_ov_reranker", "device": "gpu", "weight": "INT4"}' | jq '.'
 ```

 #### Check a certain model
````
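The README's remove-a-pipeline workflow is a two-step sequence: deactivate via PATCH, then DELETE. A hedged Python sketch of the same request plan (`pipeline_url` and `remove_pipeline_plan` are hypothetical helpers, not part of the EC-RAG API; only the endpoint shapes come from the curl commands above):

```python
import json

def pipeline_url(host_ip: str, name: str) -> str:
    # Endpoint shape taken from the curl commands in the README
    return f"http://{host_ip}:16010/v1/settings/pipelines/{name}"

def remove_pipeline_plan(host_ip: str, name: str):
    """Return the (method, url, body) steps: deactivate first, then delete."""
    url = pipeline_url(host_ip, name)
    return [
        ("PATCH", url, json.dumps({"active": "false"})),  # deactivate if active
        ("DELETE", url, None),                            # then remove it
    ]

plan = remove_pipeline_plan("192.168.1.2", "rag_test_local_llm")
```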

EdgeCraftRAG/docker_compose/intel/gpu/arc/compose.yaml

Lines changed: 1 addition & 0 deletions

```diff
@@ -11,6 +11,7 @@ services:
       https_proxy: ${https_proxy}
       HF_ENDPOINT: ${HF_ENDPOINT}
       vLLM_ENDPOINT: ${vLLM_ENDPOINT}
+      ENABLE_BENCHMARK: ${ENABLE_BENCHMARK:-false}
     volumes:
       - ${MODEL_PATH:-${PWD}}:/home/user/models
       - ${DOC_PATH:-${PWD}}:/home/user/docs
```

EdgeCraftRAG/docker_compose/intel/gpu/arc/compose_vllm.yaml

Lines changed: 1 addition & 0 deletions

```diff
@@ -11,6 +11,7 @@ services:
       https_proxy: ${https_proxy}
       HF_ENDPOINT: ${HF_ENDPOINT}
       vLLM_ENDPOINT: ${vLLM_ENDPOINT}
+      ENABLE_BENCHMARK: ${ENABLE_BENCHMARK:-false}
     volumes:
       - ${MODEL_PATH:-${PWD}}:/home/user/models
       - ${DOC_PATH:-${PWD}}:/home/user/docs
```
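Both compose files give `ENABLE_BENCHMARK` a `:-false` default, so benchmarking is opt-in. On the consuming side, a string-to-bool conversion like the following is the typical pattern; `benchmark_enabled` is an illustrative helper under that assumption, not the actual EC-RAG server code:

```python
def benchmark_enabled(env: dict) -> bool:
    # Compose's `${ENABLE_BENCHMARK:-false}` substitutes "false" when the
    # variable is unset; environment values always arrive as strings, so the
    # service must compare against "true" (case-insensitively here).
    return env.get("ENABLE_BENCHMARK", "false").lower() == "true"

default_off = benchmark_enabled({})                          # unset -> disabled
explicitly_on = benchmark_enabled({"ENABLE_BENCHMARK": "true"})
```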

EdgeCraftRAG/edgecraftrag/api/v1/chatqna.py

Lines changed: 28 additions & 2 deletions

```diff
@@ -1,9 +1,12 @@
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0

+from comps import GeneratedDoc
 from comps.cores.proto.api_protocol import ChatCompletionRequest
+from edgecraftrag.api_schema import RagOut
 from edgecraftrag.context import ctx
 from fastapi import FastAPI
+from fastapi.responses import StreamingResponse

 chatqna_app = FastAPI()

@@ -25,8 +28,31 @@ async def retrieval(request: ChatCompletionRequest):
 # ChatQnA
 @chatqna_app.post(path="/v1/chatqna")
 async def chatqna(request: ChatCompletionRequest):
+    generator = ctx.get_pipeline_mgr().get_active_pipeline().generator
+    if generator:
+        request.model = generator.model_id
     if request.stream:
-        return ctx.get_pipeline_mgr().run_pipeline(chat_request=request)
+        ret, retri_res = ctx.get_pipeline_mgr().run_pipeline(chat_request=request)
+        return ret
     else:
-        ret = ctx.get_pipeline_mgr().run_pipeline(chat_request=request)
+        ret, retri_res = ctx.get_pipeline_mgr().run_pipeline(chat_request=request)
         return str(ret)
+
+
+# RAGQnA
+@chatqna_app.post(path="/v1/ragqna")
+async def ragqna(request: ChatCompletionRequest):
+    res, retri_res = ctx.get_pipeline_mgr().run_pipeline(chat_request=request)
+    if isinstance(res, GeneratedDoc):
+        res = res.text
+    elif isinstance(res, StreamingResponse):
+        collected_data = []
+        async for chunk in res.body_iterator:
+            collected_data.append(chunk)
+        res = "".join(collected_data)
+
+    ragout = RagOut(query=request.messages, contexts=[], response=str(res))
+    for n in retri_res:
+        origin_text = n.node.get_text()
+        ragout.contexts.append(origin_text.strip())
+    return ragout
```
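The new `ragqna` handler flattens a streaming response by iterating the async body and joining the chunks. A self-contained sketch of that pattern, with `fake_stream` standing in for the pipeline's token stream:

```python
import asyncio

async def collect(body_iterator) -> str:
    """Drain an async iterator of string chunks into one string,
    mirroring the StreamingResponse branch of ragqna above."""
    collected_data = []
    async for chunk in body_iterator:
        collected_data.append(chunk)
    return "".join(collected_data)

async def fake_stream():
    # Stand-in for the LLM token stream produced by the pipeline
    for chunk in ("Edge", "Craft", "RAG"):
        yield chunk

result = asyncio.run(collect(fake_stream()))
```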

EdgeCraftRAG/edgecraftrag/api/v1/data.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -65,7 +65,7 @@ async def delete_file(name):
     # TODO: delete the nodes related to the file
     all_docs = ctx.get_file_mgr().get_all_docs()

-    nodelist = ctx.get_pipeline_mgr().run_data_prepare(docs=all_docs)
+    nodelist = ctx.get_pipeline_mgr().run_data_update(docs=all_docs)
     if nodelist is None:
         return "Error"
     pl = ctx.get_pipeline_mgr().get_active_pipeline()
@@ -91,7 +91,7 @@ async def update_file(name, request: DataIn):
     # 3. Re-run the pipeline
     # TODO: update the nodes related to the file
     all_docs = ctx.get_file_mgr().get_all_docs()
-    nodelist = ctx.get_pipeline_mgr().run_data_prepare(docs=all_docs)
+    nodelist = ctx.get_pipeline_mgr().run_data_update(docs=all_docs)
     if nodelist is None:
         return "Error"
     pl = ctx.get_pipeline_mgr().get_active_pipeline()
```

EdgeCraftRAG/edgecraftrag/api/v1/pipeline.py

Lines changed: 26 additions & 2 deletions

```diff
@@ -5,9 +5,15 @@

 from edgecraftrag.api_schema import PipelineCreateIn
 from edgecraftrag.base import IndexerType, InferenceType, ModelType, NodeParserType, PostProcessorType, RetrieverType
+from edgecraftrag.components.benchmark import Benchmark
 from edgecraftrag.components.generator import QnAGenerator
 from edgecraftrag.components.indexer import VectorIndexer
-from edgecraftrag.components.node_parser import HierarchyNodeParser, SimpleNodeParser, SWindowNodeParser
+from edgecraftrag.components.node_parser import (
+    HierarchyNodeParser,
+    SimpleNodeParser,
+    SWindowNodeParser,
+    UnstructedNodeParser,
+)
 from edgecraftrag.components.postprocessor import MetadataReplaceProcessor, RerankProcessor
 from edgecraftrag.components.retriever import AutoMergeRetriever, SimpleBM25Retriever, VectorSimRetriever
 from edgecraftrag.context import ctx
@@ -28,6 +34,14 @@ async def get_pipeline(name):
     return ctx.get_pipeline_mgr().get_pipeline_by_name_or_id(name)


+# GET Pipeline benchmark
+@pipeline_app.get(path="/v1/settings/pipelines/{name}/benchmark")
+async def get_pipeline_benchmark(name):
+    pl = ctx.get_pipeline_mgr().get_pipeline_by_name_or_id(name)
+    if pl and pl.benchmark:
+        return pl.benchmark
+
+
 # POST Pipeline
 @pipeline_app.post(path="/v1/settings/pipelines")
 async def add_pipeline(request: PipelineCreateIn):
@@ -49,7 +63,7 @@ async def add_pipeline(request: PipelineCreateIn):
 async def update_pipeline(name, request: PipelineCreateIn):
     pl = ctx.get_pipeline_mgr().get_pipeline_by_name_or_id(name)
     if pl is None:
-        return None
+        return "Pipeline not exists"
     active_pl = ctx.get_pipeline_mgr().get_active_pipeline()
     if pl == active_pl:
         if not request.active:
@@ -61,6 +75,12 @@ async def update_pipeline(name, request: PipelineCreateIn):
     return pl


+# REMOVE Pipeline
+@pipeline_app.delete(path="/v1/settings/pipelines/{name}")
+async def remove_pipeline(name):
+    return ctx.get_pipeline_mgr().remove_pipeline_by_name_or_id(name)
+
+
 def update_pipeline_handler(pl, req):
     if req.node_parser is not None:
         np = req.node_parser
@@ -86,6 +106,8 @@ def update_pipeline_handler(pl, req):
             )
         case NodeParserType.SENTENCEWINDOW:
             pl.node_parser = SWindowNodeParser.from_defaults(window_size=np.window_size)
+        case NodeParserType.UNSTRUCTURED:
+            pl.node_parser = UnstructedNodeParser(chunk_size=np.chunk_size, chunk_overlap=np.chunk_overlap)
     ctx.get_node_parser_mgr().add(pl.node_parser)

     if req.indexer is not None:
@@ -169,6 +191,8 @@ def update_pipeline_handler(pl, req):
         # Use weakref to achieve model deletion and memory release
         model_ref = weakref.ref(model)
         pl.generator = QnAGenerator(model_ref, gen.prompt_path, gen.inference_type)
+
+        pl.benchmark = Benchmark(pl.enable_benchmark, gen.inference_type)
     else:
         return "Inference Type Not Supported"
```
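The comment in the last hunk says the weakref enables "model deletion and memory release": the generator holds only a weak reference, so once the model manager drops its strong reference the model object can be collected, and calling the ref returns `None`. A self-contained illustration with a toy `Model` class (not the EC-RAG model type):

```python
import gc
import weakref

class Model:
    """Toy stand-in for a loaded model object."""
    pass

model = Model()
model_ref = weakref.ref(model)          # what the generator keeps
alive_before = model_ref() is not None  # strong ref still exists

del model        # the model manager releases its strong reference
gc.collect()     # make collection deterministic for this sketch

alive_after = model_ref() is not None   # weakref now resolves to None
```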

EdgeCraftRAG/edgecraftrag/api_schema.py

Lines changed: 6 additions & 0 deletions

```diff
@@ -61,3 +61,9 @@ class DataIn(BaseModel):

 class FilesIn(BaseModel):
     local_paths: Optional[list[str]] = None
+
+
+class RagOut(BaseModel):
+    query: str
+    contexts: Optional[list[str]] = None
+    response: str
```
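`RagOut` is the pydantic schema the new `/v1/ragqna` endpoint returns: a query, the response text, and the retrieved context strings backing it. A dependency-free dataclass stand-in (so the sketch runs without pydantic; `RagOutSketch` is illustrative, not the real class):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RagOutSketch:
    """Dataclass mirror of the pydantic RagOut schema above."""
    query: str
    response: str
    contexts: Optional[list] = field(default_factory=list)

out = RagOutSketch(query="what is EC-RAG?", response="an edge RAG framework")
# The ragqna handler appends each retrieved node's stripped text:
out.contexts.append("EC-RAG curates RAG pipelines for edge hardware.")
```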

EdgeCraftRAG/edgecraftrag/base.py

Lines changed: 2 additions & 0 deletions

```diff
@@ -44,6 +44,7 @@ class NodeParserType(str, Enum):
     SIMPLE = "simple"
     HIERARCHY = "hierarchical"
     SENTENCEWINDOW = "sentencewindow"
+    UNSTRUCTURED = "unstructured"


 class IndexerType(str, Enum):
@@ -81,6 +82,7 @@ class InferenceType(str, Enum):
 class CallbackType(str, Enum):

     DATAPREP = "dataprep"
+    DATAUPDATE = "dataupdate"
     RETRIEVE = "retrieve"
     PIPELINE = "pipeline"
```
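Because `NodeParserType` subclasses `str`, the new `UNSTRUCTURED` member compares equal to the raw string used in API payloads, so request handling needs no extra mapping layer. The enum below reproduces the members from the diff above:

```python
from enum import Enum

class NodeParserType(str, Enum):
    # str-valued Enum: members compare equal to their plain-string values
    SIMPLE = "simple"
    HIERARCHY = "hierarchical"
    SENTENCEWINDOW = "sentencewindow"
    UNSTRUCTURED = "unstructured"

# Parsing the payload value yields the enum member directly
parsed = NodeParserType("unstructured")
```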
