Commit 142005f

SLasyaN, lasyasn, and aMahanna authored
ArangoDB Retriever (#3)
* Refactored Retriever for Arango
  added: arango.py, README_arango.md
  modified: config.py, requirements.txt, compose.yaml
* refactored retriever
* commit after PR review
* Update comps/retrievers/src/README_arango.md
* Update comps/retrievers/src/integrations/arangodb.py
* Update comps/retrievers/src/requirements.txt
* update: docs
* misc: fix comment
* add comment
* cleanup: `arangodb.py` format with black/isort, use env vars, remove custom `graph_name` functionality (no longer needed)
* rm: comment

---------

Co-authored-by: lasyasn <lasyan640@gmail.com>
Co-authored-by: Anthony Mahanna <43019056+aMahanna@users.noreply.github.com>
Co-authored-by: Anthony Mahanna <anthony.mahanna@arangodb.com>
1 parent f1c8602 commit 142005f

8 files changed (+570 -2 lines)

comps/retrievers/README.md

Lines changed: 4 additions & 0 deletions
@@ -41,3 +41,7 @@ For details, please refer to this [readme](src/README_neo4j.md)
 ## Retriever Microservice with Pathway

 For details, please refer to this [readme](src/README_pathway.md)
+
+## Retriever Microservice with ArangoDB
+
+For details, please refer to this [readme](src/README_arangodb.md)

comps/retrievers/deployment/docker_compose/compose.yaml

Lines changed: 22 additions & 0 deletions
@@ -12,6 +12,7 @@ include:
   - ../../../third_parties/tei/deployment/docker_compose/compose.yaml
   - ../../../third_parties/tgi/deployment/docker_compose/compose.yaml
   - ../../../third_parties/vdms/deployment/docker_compose/compose.yaml
+  - ../../../third_parties/arango/deployment/docker_compose/compose.yaml

 services:
   retriever:
@@ -170,6 +171,27 @@ services:
         condition: service_healthy
       tei-embedding-serving:
         condition: service_healthy
+
+  retriever-arango:
+    extends: retriever
+    container_name: retriever-arango
+    depends_on:
+      - arango-vector-db
+      - tei-embedding-service
+    environment:
+      RETRIEVER_COMPONENT_NAME: ${RETRIEVER_COMPONENT_NAME:-OPEA_RETRIEVER_ARANGO}
+      no_proxy: ${no_proxy}
+      http_proxy: ${http_proxy}
+      https_proxy: ${https_proxy}
+      ARANGO_URL: ${ARANGO_URL}
+      ARANGO_USERNAME: ${ARANGO_USERNAME}
+      ARANGO_PASSWORD: ${ARANGO_PASSWORD}
+      ARANGO_DB_NAME: ${ARANGO_DB_NAME}
+      ARANGO_GRAPH_NAME: ${ARANGO_GRAPH_NAME}
+      TEI_EMBED_MODEL: ${TEI_EMBED_MODEL}
+      TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT}
+      HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+      OPENAI_API_KEY: ${OPENAI_API_KEY}


 networks:
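
The `retriever-arango` service above takes its connection settings from the shell environment, so an unset variable is passed through as an empty string. As a rough illustration, the sketch below checks for unset variables before running `docker compose up`. The helper `missing_vars` and the choice of which variables count as required are assumptions made for this example; neither comes from the commit.

```python
import os

# Variable names copied from the retriever-arango service definition.
# Treating all of them as required is an assumption of this sketch.
REQUIRED_VARS = [
    "ARANGO_URL",
    "ARANGO_USERNAME",
    "ARANGO_PASSWORD",
    "ARANGO_DB_NAME",
    "ARANGO_GRAPH_NAME",
    "TEI_EMBED_MODEL",
    "TEI_EMBEDDING_ENDPOINT",
]


def missing_vars(env=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_vars()` with no arguments inspects the current process environment and returns a list that is empty when every listed variable is set.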

comps/retrievers/src/Dockerfile

Lines changed: 2 additions & 1 deletion
@@ -10,7 +10,8 @@ RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missin
     libjemalloc-dev \
     libcairo2 \
     libglib2.0-0 \
-    vim
+    vim \
+    git

 RUN useradd -m -s /bin/bash user && \
     mkdir -p /home/user && \
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+# Retriever Microservice with ArangoDB (work-in-progress)
+
+## 🚀Start Microservice with Python
+
+### Install Requirements
+
+```bash
+pip install -r requirements.txt
+```
+
+### Start ArangoDB Server
+
+To launch ArangoDB locally, first ensure you have docker installed. Then, you can launch the database with the following docker command.
+
+
+```bash
+docker run -d --name arango-vector-db -p 8529:8529 -e ARANGO_ROOT_PASSWORD=password arangodb/arangodb:3.12.4 --experimental-vector-index=true
+```
+
+### Setup Environment Variables
+
+```bash
+export no_proxy=${your_no_proxy}
+export http_proxy=${your_http_proxy}
+export https_proxy=${your_https_proxy}
+export ARANGO_URL=${your_arangodb_uri}
+export ARANGO_USERNAME=${your_arangodb_username}
+export ARANGO_PASSWORD=${your_arangodb_password}
+export ARANGO_DB_NAME=${your_arangodb_database}
+
+```
+
+
+## 🚀Start Microservice with Docker
+
+### Build Docker Image
+
+```bash
+cd ../../
+docker build -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
+```
+
+### Run Docker with CLI
+
+```bash
+docker run -d --name="retriever-arango-server" -p 7000:7000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e ARANGO_URL="http://localhost:8529" -e RETRIEVER_COMPONENT_NAME="OPEA_RETRIEVER_ARANGO" opea/retriever:latest
+```
+
+
+
+## 🚀Consume Retriever Service
+
+### Check Service Status
+
+```bash
+curl http://${your_ip}:7000/v1/health_check \
+  -X GET \
+  -H 'Content-Type: application/json'
+```
+
+### Query Retriever Service
+
+To consume the Retriever Microservice, you can generate a mock embedding vector of length 768 with Python.
+
+```bash
+export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
+curl http://${your_ip}:7000/v1/retrieval \
+  -X POST \
+  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding}}" \
+  -H 'Content-Type: application/json'
+```
+
+```bash
+export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
+curl http://localhost:7000/v1/retrieval \
+  -X POST \
+  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity\", \"k\":4}" \
+  -H 'Content-Type: application/json'
+```
+
+
+
+```bash
+export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
+curl http://localhost:7000/v1/retrieval \
+  -X POST \
+  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_distance_threshold\", \"k\":4, \"distance_threshold\":1.0}" \
+  -H 'Content-Type: application/json'
+```
+
+
+
+```bash
+export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
+curl http://localhost:7000/v1/retrieval \
+  -X POST \
+  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_score_threshold\", \"k\":4, \"score_threshold\":0.2}" \
+  -H 'Content-Type: application/json'
+```
+
+
+```bash
+export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
+curl http://localhost:7000/v1/retrieval \
+  -X POST \
+  -d "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"mmr\", \"k\":4, \"fetch_k\":20, \"lambda_mult\":0.5}" \
+  -H 'Content-Type: application/json'
+```
+
+
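
The curl examples in the README above all post the same JSON body to `/v1/retrieval` and vary only `search_type` and its tuning parameters. The same requests could be sketched in Python with the standard library, as below. The helper names `mock_embedding`, `build_payload`, `check_health`, and `post_retrieval` are hypothetical and do not appear in the commit. The endpoint paths, port, and field names are copied from the curl commands, and decoding the response as JSON is an assumption.

```python
import json
import random
import urllib.request


def mock_embedding(dim=768):
    """Random vector standing in for a real embedding, as in the README."""
    return [random.uniform(-1, 1) for _ in range(dim)]


def build_payload(text, embedding, search_type="similarity", k=4, **extra):
    """Assemble the request body; `extra` carries parameters such as
    distance_threshold, score_threshold, fetch_k, or lambda_mult."""
    return {"text": text, "embedding": embedding,
            "search_type": search_type, "k": k, **extra}


def check_health(base_url="http://localhost:7000"):
    """Return True if GET /v1/health_check answers with HTTP 200."""
    with urllib.request.urlopen(f"{base_url}/v1/health_check") as resp:
        return resp.status == 200


def post_retrieval(payload, base_url="http://localhost:7000"):
    """POST the payload to /v1/retrieval and decode the reply as JSON
    (the response format is assumed, not taken from the commit)."""
    req = urllib.request.Request(
        f"{base_url}/v1/retrieval",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, the README's MMR query would correspond to `post_retrieval(build_payload("What is the revenue of Nike in 2023?", mock_embedding(), search_type="mmr", fetch_k=20, lambda_mult=0.5))`.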
