Commit 14f509f

Squashed commit of the following:
commit ad8f517  Dina Suehiro Jones <dina.s.jones@intel.com>  Wed Feb 26 11:35:04 2025 -0800
    Dataprep Multimodal Redis README fixes (opea-project#1330)
    Signed-off-by: Dina Suehiro Jones <dina.s.jones@intel.com>

commit c70f868  Ervin Castelino <89144265+Ervin0307@users.noreply.github.com>  Tue Feb 25 07:42:42 2025 +0000
    Update README.md (opea-project#1253)
    Signed-off-by: Ervin <ervincas13@gmail.com>

commit 589587a  ZePan110 <ze.pan@intel.com>  Mon Feb 24 17:54:51 2025 +0800
    Fix docker image security issues (opea-project#1321)
    Signed-off-by: ZePan110 <ze.pan@intel.com>

commit f5699e4  Brijesh Thummar <brijeshthummar02@gmail.com>  Mon Feb 24 12:00:19 2025 +0530
    [Doc] vLLM - typo in README.md (opea-project#1302)
    Fix typo in README.
    Signed-off-by: brijeshthummar02@gmail.com <brijeshthummar02@gmail.com>

commit 364ccad  Jonathan Minkin <minkinj@amazon.com>  Sun Feb 23 19:27:31 2025 -0800
    Add support for string message in Bedrock textgen (opea-project#1291)
    * Add support for string message in Bedrock, update README
    * Add test for string message in test script
    Signed-off-by: Jonathan Minkin <minkinj@amazon.com>

commit 625aec9  Daniel De León <111013930+daniel-de-leon-user293@users.noreply.github.com>  Fri Feb 21 13:20:58 2025 -0800
    Add native support for toxicity detection guardrail microservice (opea-project#1258)
    * add OPEA native support for toxic-prompt-roberta
    * add test script back
    * [pre-commit.ci] auto fixes from pre-commit.com hooks (https://pre-commit.ci)
    * add comp name env variable
    * set default port to 9090
    * add service to compose
    * removed debug print
    * remove triton version because habana updated
    * add locust results
    * skip warmup for halluc test
    Signed-off-by: Daniel Deleon <daniel.de.leon@intel.com>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: Liang Lv <liang1.lv@intel.com>
    Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>

commit 4352636  ZePan110 <ze.pan@intel.com>  Fri Feb 21 17:14:08 2025 +0800
    Fix trivy issue in Dockerfile (opea-project#1304)
    Signed-off-by: ZePan110 <ze.pan@intel.com>

commit 135ef91  rbrugaro <rita.brugarolas.brufau@intel.com>  Thu Feb 20 15:29:39 2025 -0800
    Change neo4j Bolt default PORT from 7687 to $NEO4J_PORT2 (opea-project#1292)
    * configured the port in the neo4j compose.yaml to use a variable value, and made all
      corresponding changes in the neo4j dataprep and retriever components and test scripts
      to use the env variable instead of the default port value
    * [pre-commit.ci] auto fixes from pre-commit.com hooks (https://pre-commit.ci)
    * missing positional arg in milvus dataprep
    * remove redundancy in stop_docker
    * resolve retriever-to-neo4j connectivity issue (bad URL)
    * set neo4j ports to neo4j defaults and fix environment variables in READMEs
    Signed-off-by: rbrugaro <rita.brugarolas.brufau@intel.com>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: Liang Lv <liang1.lv@intel.com>

commit a4f6af1  Letong Han <106566639+letonghan@users.noreply.github.com>  Thu Feb 20 13:38:01 2025 +0800
    Refine dataprep test scripts (opea-project#1305)
    * Refine dataprep Milvus CI
    Signed-off-by: letonghan <letong.han@intel.com>

commit 2102a8e  dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>  Thu Feb 20 10:46:19 2025 +0800
    Bump transformers (opea-project#1278)
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: Liang Lv <liang1.lv@intel.com>

commit a033c05  Liang Lv <liang1.lv@intel.com>  Wed Feb 19 14:19:02 2025 +0800
    Fix milvus dataprep ingest files failure (opea-project#1299)
    Signed-off-by: lvliang-intel <liang1.lv@intel.com>
    Co-authored-by: Letong Han <106566639+letonghan@users.noreply.github.com>

commit 022d052  lkk <33276950+lkk12014402@users.noreply.github.com>  Wed Feb 19 09:50:59 2025 +0800
    fix agent message format (opea-project#1297)
    1. set a default session_id for the react_langchain strategy, needed after the langchain version upgrade
    2. fix request message format

commit 7727235  Liang Lv <liang1.lv@intel.com>  Tue Feb 18 20:55:20 2025 +0800
    Refine CLIP embedding microservice by leveraging the third-party CLIP (opea-project#1298)
    * Refine CLIP embedding microservice using dependency
    Signed-off-by: lvliang-intel <liang1.lv@intel.com>

commit a353f99  Spycsh <39623753+Spycsh@users.noreply.github.com>  Mon Feb 17 11:35:38 2025 +0800
    Fix telemetry connection issue when disabling telemetry (opea-project#1290)
    * use ENABLE_OPEA_TELEMETRY to control whether to enable OpenTelemetry, default false
    * fix the issue that logs always show a telemetry connection error on each request when telemetry is disabled
    * stop that error from propagating to microservices when telemetry is disabled
    * [pre-commit.ci] auto fixes from pre-commit.com hooks (https://pre-commit.ci)
    * fix UT failure that required the flag to be on
    Signed-off-by: Spycsh <sihan.chen@intel.com>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 7c2e7f6  xiguiw <111278656+xiguiw@users.noreply.github.com>  Sat Feb 15 11:25:15 2025 +0800
    update vLLM CPU to latest tag (opea-project#1285)
    Get the latest vLLM stable version.
    Signed-off-by: Wang, Xigui <xigui.wang@intel.com>

commit c3c8497  Letong Han <106566639+letonghan@users.noreply.github.com>  Fri Feb 14 22:29:38 2025 +0800
    Fix Qdrant retriever RAG issue (opea-project#1289)
    * Fix Qdrant retriever "no retrieved result" issue
    Signed-off-by: letonghan <letong.han@intel.com>

commit 47f68a4  Letong Han <106566639+letonghan@users.noreply.github.com>  Fri Feb 14 20:29:27 2025 +0800
    Fix the retriever issue of Milvus (opea-project#1286)
    * Fix the Milvus DB retriever issue where data could not be retrieved after being ingested with dataprep
    Signed-off-by: letonghan <letong.han@intel.com>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 0e3f8ab  minmin-intel <minmin.hou@intel.com>  Thu Feb 13 20:24:02 2025 -0800
    Improve multi-turn capability for agent (opea-project#1248)
    * first code for multi-turn
    * test redis persistence
    * integrate persistent store in react llama
    * test multi-turn
    * multi-turn for assistants api and chatcompletion api
    * update readme and ut scripts
    * [pre-commit.ci] auto fixes from pre-commit.com hooks (https://pre-commit.ci)
    * fix bug
    * change memory type naming
    * fix with_memory as str
    Signed-off-by: minmin-intel <minmin.hou@intel.com>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 4a90692  rbrugaro <rita.brugarolas.brufau@intel.com>  Thu Feb 13 18:12:25 2025 -0800
    Bug fix: neo4j dataprep ingest error handling and skip_ingestion argument passing (opea-project#1288)
    * Fix dataprep ingest error handling and skip_ingestion argument passing in the dataprep neo4j integration
    Signed-off-by: rbrugaro <rita.brugarolas.brufau@intel.com>

commit d1dfd0e  Spycsh <39623753+Spycsh@users.noreply.github.com>  Thu Feb 13 22:39:47 2025 +0800
    Align mongo related chathistory/feedbackmanagement/promptregistry image names with examples (opea-project#1284)
    Signed-off-by: Spycsh <sihan.chen@intel.com>
    Co-authored-by: Liang Lv <liang1.lv@intel.com>

commit bef501c  Liang Lv <liang1.lv@intel.com>  Thu Feb 13 21:18:58 2025 +0800
    Fix VDMS retrieval issue (opea-project#1252)
    Signed-off-by: lvliang-intel <liang1.lv@intel.com>

commit 23b2be2  ZePan110 <ze.pan@intel.com>  Thu Feb 13 16:07:14 2025 +0800
    Fix "Build latest images on push event" workflow (opea-project#1282)
    Signed-off-by: ZePan110 <ze.pan@intel.com>

commit f8e6216  Spycsh <39623753+Spycsh@users.noreply.github.com>  Wed Feb 12 15:45:14 2025 +0800
    fix metric id issue when initializing multiple Orchestrator instances (opea-project#1280)
    Signed-off-by: Spycsh <sihan.chen@intel.com>

commit d3906ce  chen, suyue <suyue.chen@intel.com>  Wed Feb 12 14:56:55 2025 +0800
    update default service list (opea-project#1276)
    Signed-off-by: chensuyue <suyue.chen@intel.com>

commit 17b9672  XinyaoWa <xinyao.wang@intel.com>  Wed Feb 12 13:53:31 2025 +0800
    Fix langchain and huggingface version to avoid bug in FaqGen and DocSum, remove vllm hpu triton version fix (opea-project#1275)
    * Fix langchain and huggingface version to avoid bug
    Signed-off-by: Xinyao Wang <xinyao.wang@intel.com>

commit b777db7  Letong Han <106566639+letonghan@users.noreply.github.com>  Mon Feb 10 16:00:55 2025 +0800
    Fix Dataprep Ingest Data Issue (opea-project#1271)
    Trace:
    1. The update of `langchain_huggingface.HuggingFaceEndpointEmbeddings` caused the wrong size of embedding vectors.
    2. Wrong-size vectors were saved into the Redis database, and the indices were not created correctly.
    3. The retriever could not retrieve data from Redis using the index due to the reasons above.
    4. RAG then appeared not to work, because the uploaded file could not be retrieved from the database.
    Solution: Replace all uses of `langchain_huggingface.HuggingFaceEndpointEmbeddings` with
    `langchain_community.embeddings.HuggingFaceInferenceAPIEmbeddings`, and modify the related READMEs and scripts.
    Related issues: opea-project/GenAIExamples#1473, opea-project/GenAIExamples#1482
    Signed-off-by: letonghan <letong.han@intel.com>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 0df374b  Daniel De León <111013930+daniel-de-leon-user293@users.noreply.github.com>  Sun Feb 9 22:01:58 2025 -0800
    Update docs for LlamaGuard & WildGuard Microservice (opea-project#1259)
    * working README for CLI and compose
    * update for direct python execution
    * fix formatting
    * [pre-commit.ci] auto fixes from pre-commit.com hooks (https://pre-commit.ci)
    * bring back depends_on condition
    Signed-off-by: Daniel Deleon <daniel.de.leon@intel.com>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>

commit fb86b5e  Louie Tsai <louie.tsai@intel.com>  Sat Feb 8 00:58:33 2025 -0800
    Add Deepseek model into validated model table and add required Gaudi cards for LLM microservice (opea-project#1267)
    * Update README.md for Deepseek support and the number of required Gaudi cards
    Signed-off-by: Tsai, Louie <louie.tsai@intel.com>

commit ecb7f7b  Spycsh <39623753+Spycsh@users.noreply.github.com>  Fri Feb 7 16:58:22 2025 +0800
    Fix web-retrievers hub client and tei endpoint issue (opea-project#1270)
    Signed-off-by: Spycsh <sihan.chen@intel.com>

commit 5baada8  ZePan110 <ze.pan@intel.com>  Thu Feb 6 15:03:00 2025 +0800
    Fix CD test issue (opea-project#1263)
    1. Fix template name in README
    2. Fix invalid release name
    Signed-off-by: ZePan110 <ze.pan@intel.com>

commit fa01f46  minmin-intel <minmin.hou@intel.com>  Wed Feb 5 13:57:57 2025 -0800
    fix tei embedding and tei reranking bug (opea-project#1256)
    Signed-off-by: minmin-intel <minmin.hou@intel.com>
    Co-authored-by: Abolfazl Shahbazi <12436063+ashahba@users.noreply.github.com>

commit 4ede405  Eero Tamminen <eero.t.tamminen@intel.com>  Wed Feb 5 22:04:50 2025 +0200
    Create token metrics only when they are available (opea-project#1092)
    * This avoids generating useless token/request histogram metrics for services that use the
      Orchestrator class but never call its token-processing functionality. (It helps differentiate
      frontend megaservice metrics from backend megaservice ones, especially when multiple OPEA
      applications run in the same cluster.) Also change the Orchestrator CI test workaround to use
      a unique prefix for each metric instance, instead of metrics being (singleton) class variables.
    * Add locking for latency metric creation / method change, as it can be called from multiple
      request-handling threads.
    Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
    Co-authored-by: Malini Bhandaru <malini.bhandaru@intel.com>
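
The dataprep fix in commit b777db7 swaps the embeddings wrapper class. A minimal sketch of that swap is below; the endpoint model name and the token environment variable are illustrative assumptions, not values taken from the OPEA dataprep components.

```python
# Sketch of the embeddings swap described in commit b777db7 (opea-project#1271).
import os

# Before: the HF endpoint wrapper whose update produced wrong-size vectors
# (per the commit message).
# from langchain_huggingface import HuggingFaceEndpointEmbeddings
# embedder = HuggingFaceEndpointEmbeddings(model="http://localhost:8080")

# After: the Inference API wrapper from langchain_community.
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings

embedder = HuggingFaceInferenceAPIEmbeddings(
    api_key=os.environ["HUGGINGFACEHUB_API_TOKEN"],  # assumed token variable
    model_name="BAAI/bge-base-en-v1.5",              # assumed embedding model
)

vectors = embedder.embed_documents(["hello world"])
print(len(vectors[0]))  # dimension should now match the Redis index schema
```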
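Commit 4ede405 boils down to creating histogram metrics lazily and guarding their creation with a lock, so that services that never process tokens never register the metrics. The sketch below is not the OPEA Orchestrator code; the class, metric names, and prefix handling are made up to illustrate the pattern.

```python
# Illustrative sketch of lazy, lock-guarded metric creation in the spirit of commit 4ede405.
import threading

from prometheus_client import Histogram


class TokenMetrics:
    def __init__(self, prefix: str):
        # A unique prefix per instance avoids name collisions when several
        # orchestrators live in one process (cf. the metric id fix in f8e6216).
        self._prefix = prefix
        self._lock = threading.Lock()
        self._first_token_latency = None

    def record_first_token(self, seconds: float) -> None:
        # Create the histogram only when token processing actually happens.
        if self._first_token_latency is None:
            with self._lock:
                if self._first_token_latency is None:  # double-checked under the lock
                    self._first_token_latency = Histogram(
                        f"{self._prefix}_first_token_latency_seconds",
                        "Latency until the first generated token.",
                    )
        self._first_token_latency.observe(seconds)
```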
1 parent 142005f · commit 14f509f


139 files changed: +1918 additions, -839 deletions

.github/workflows/_comps-workflow.yml

Lines changed: 0 additions & 1 deletion
@@ -71,7 +71,6 @@ jobs:
       fi
       if [[ $(grep -c "vllm-gaudi:" ${docker_compose_yml}) != 0 ]]; then
         git clone --depth 1 --branch v0.6.4.post2+Gaudi-1.19.0 https://github.com/HabanaAI/vllm-fork.git
-        sed -i 's/triton/triton==3.1.0/g' vllm-fork/requirements-hpu.txt
       fi
     - name: Get build list
       id: get-build-list

.github/workflows/_run-helm-chart.yml

Lines changed: 2 additions & 1 deletion
@@ -134,8 +134,9 @@ jobs:
       if [[ "${service,,}" == *"third_parties"* ]]; then
         CHART_NAME="$(echo "${service,,}"|cut -d'/' -f2)" # bridgetower
       else
-        CHART_NAME="${service_name}" # agent
+        CHART_NAME="${service_name}" # web_retrievers
       fi
+      CHART_NAME=$(echo "$CHART_NAME" | tr -cd 'a-z0-9')
       echo "service_name=$service_name" >> $GITHUB_ENV
       echo "CHART_NAME=$CHART_NAME" >> $GITHUB_ENV
       echo "RELEASE_NAME=${CHART_NAME}$(date +%d%H%M%S)" >> $GITHUB_ENV

.github/workflows/docker/compose/chathistory-compose.yaml

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@

 # this file should be run in the root of the repo
 services:
-  chathistory-mongo-server:
+  chathistory-mongo:
     build:
       dockerfile: comps/chathistory/src/Dockerfile
-    image: ${REGISTRY:-opea}/chathistory-mongo-server:${TAG:-latest}
+    image: ${REGISTRY:-opea}/chathistory-mongo:${TAG:-latest}

.github/workflows/manual-comps-test.yml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ on:
   inputs:
     services:
       default: "asr"
-      description: "List of services to test [agent,asr,chathistory,dataprep,embeddings,feedback_management,finetuning,guardrails,knowledgegraphs,llms,lvms,nginx,prompt_registry,ragas,rerankings,retrievers,tts,web_retrievers]"
+      description: "List of services to test [agent,asr,chathistory,animation,dataprep,embeddings,feedback_management,finetuning,guardrails,image2image,image2video,intent_detection,llms,lvms,prompt_registry,ragas,rerankings,retrievers,text2image,text2sql,third_parties,tts,vectorstores,web_retrievers]"
       required: true
       type: string
     build:

.github/workflows/manual-docker-publish.yml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ on:
   inputs:
     services:
       default: ""
-      description: "List of services to test [agent,asr,chathistory,dataprep,embeddings,feedback_management,finetuning,guardrails,knowledgegraphs,llms,lvms,nginx,prompt_registry,ragas,rerankings,retrievers,tts,web_retrievers]"
+      description: "List of services to test [agent,asr,chathistory,animation,dataprep,embeddings,feedback_management,finetuning,guardrails,image2image,image2video,intent_detection,llms,lvms,prompt_registry,ragas,rerankings,retrievers,text2image,text2sql,third_parties,tts,vectorstores,web_retrievers]"
       required: false
       type: string
     images:

.github/workflows/push-image-build.yml

Lines changed: 1 addition & 2 deletions
@@ -62,7 +62,7 @@ jobs:

   image-build:
     needs: get-build-matrix
-    if: ${{ fromJSON(needs.get-build-matrix.outputs.services).length != 0 }}
+    if: needs.get-build-matrix.outputs.services != '[]'
     strategy:
       matrix:
         service: ${{ fromJSON(needs.get-build-matrix.outputs.services) }}
@@ -96,7 +96,6 @@ jobs:
       fi
      if [[ $(grep -c "vllm-gaudi:" ${docker_compose_path}) != 0 ]]; then
         git clone --depth 1 --branch v0.6.4.post2+Gaudi-1.19.0 https://github.com/HabanaAI/vllm-fork.git
-        sed -i 's/triton/triton==3.1.0/g' vllm-fork/requirements-hpu.txt
       fi

     - name: Build Image

README.md

Lines changed: 0 additions & 2 deletions
@@ -61,8 +61,6 @@ The initially supported `Microservices` are described in the below table. More `
 A `Microservices` can be created by using the decorator `register_microservice`. Taking the `embedding microservice` as an example:

 ```python
-from langchain_community.embeddings import HuggingFaceHubEmbeddings
-
 from comps import register_microservice, EmbedDoc, ServiceType, TextDoc

comps/__init__.py

Lines changed: 1 addition & 2 deletions
@@ -51,8 +51,7 @@
 from comps.cores.mega.micro_service import MicroService, register_microservice, opea_microservices

 # Telemetry
-if os.getenv("ENABLE_OPEA_TELEMETRY", "false").lower() == "true":
-    from comps.cores.telemetry.opea_telemetry import opea_telemetry
+from comps.cores.telemetry.opea_telemetry import opea_telemetry

 # Common
 from comps.cores.common.component import OpeaComponent, OpeaComponentRegistry, OpeaComponentLoader

comps/agent/src/README.md

Lines changed: 90 additions & 49 deletions
Large diffs are not rendered by default.

comps/agent/src/agent.py

Lines changed: 41 additions & 26 deletions
@@ -18,7 +18,7 @@
 from comps.agent.src.integrations.agent import instantiate_agent
 from comps.agent.src.integrations.global_var import assistants_global_kv, threads_global_kv
 from comps.agent.src.integrations.thread import instantiate_thread_memory, thread_completion_callback
-from comps.agent.src.integrations.utils import assemble_store_messages, get_args
+from comps.agent.src.integrations.utils import assemble_store_messages, get_args, get_latest_human_message_from_store
 from comps.cores.proto.api_protocol import (
     AssistantsObject,
     ChatCompletionRequest,
@@ -40,7 +40,7 @@

 logger.info("========initiating agent============")
 logger.info(f"args: {args}")
-agent_inst = instantiate_agent(args, args.strategy, with_memory=args.with_memory)
+agent_inst = instantiate_agent(args)


 class AgentCompletionRequest(ChatCompletionRequest):
@@ -76,7 +76,7 @@ async def llm_generate(input: AgentCompletionRequest):
     if isinstance(input.messages, str):
         messages = input.messages
     else:
-        # TODO: need handle multi-turn messages
+        # last user message
         messages = input.messages[-1]["content"]

     # 2. prepare the input for the agent
@@ -90,7 +90,6 @@ async def llm_generate(input: AgentCompletionRequest):
     else:
         logger.info("-----------NOT STREAMING-------------")
         response = await agent_inst.non_streaming_run(messages, config)
-        logger.info("-----------Response-------------")
         return GeneratedDoc(text=response, prompt=messages)


@@ -100,14 +99,14 @@ class RedisConfig(BaseModel):

 class AgentConfig(BaseModel):
     stream: Optional[bool] = False
-    agent_name: Optional[str] = "OPEA_Default_Agent"
+    agent_name: Optional[str] = "OPEA_Agent"
     strategy: Optional[str] = "react_llama"
-    role_description: Optional[str] = "LLM enhanced agent"
+    role_description: Optional[str] = "AI assistant"
     tools: Optional[str] = None
     recursion_limit: Optional[int] = 5

-    model: Optional[str] = "meta-llama/Meta-Llama-3-8B-Instruct"
-    llm_engine: Optional[str] = None
+    model: Optional[str] = "meta-llama/Llama-3.3-70B-Instruct"
+    llm_engine: Optional[str] = "vllm"
     llm_endpoint_url: Optional[str] = None
     max_new_tokens: Optional[int] = 1024
     top_k: Optional[int] = 10
@@ -117,10 +116,14 @@ class AgentConfig(BaseModel):
     return_full_text: Optional[bool] = False
     custom_prompt: Optional[str] = None

-    # short/long term memory
-    with_memory: Optional[bool] = False
-    # persistence
-    with_store: Optional[bool] = False
+    # # short/long term memory
+    with_memory: Optional[bool] = True
+    # agent memory config
+    # chat_completion api: only supports checkpointer memory
+    # assistants api: supports checkpointer and store memory
+    # checkpointer: in-memory checkpointer - MemorySaver()
+    # store: redis store
+    memory_type: Optional[str] = "checkpointer"  # choices: checkpointer, store
     store_config: Optional[RedisConfig] = None

     timeout: Optional[int] = 60
@@ -147,18 +150,17 @@ class CreateAssistant(CreateAssistantsRequest):
 )
 def create_assistants(input: CreateAssistant):
     # 1. initialize the agent
-    agent_inst = instantiate_agent(
-        input.agent_config, input.agent_config.strategy, with_memory=input.agent_config.with_memory
-    )
+    print("@@@ Initializing agent with config: ", input.agent_config)
+    agent_inst = instantiate_agent(input.agent_config)
     assistant_id = agent_inst.id
     created_at = int(datetime.now().timestamp())
     with assistants_global_kv as g_assistants:
         g_assistants[assistant_id] = (agent_inst, created_at)
         logger.info(f"Record assistant inst {assistant_id} in global KV")

-    if input.agent_config.with_store:
+    if input.agent_config.memory_type == "store":
         logger.info("Save Agent Config to database")
-        agent_inst.with_store = input.agent_config.with_store
+        # agent_inst.memory_type = input.agent_config.memory_type
         print(input)
         global db_client
         if db_client is None:
@@ -172,6 +174,7 @@ def create_assistants(input: CreateAssistant):
     return AssistantsObject(
         id=assistant_id,
         created_at=created_at,
+        model=input.agent_config.model,
     )


@@ -211,7 +214,7 @@ def create_messages(thread_id, input: CreateMessagesRequest):
     if isinstance(input.content, str):
         query = input.content
     else:
-        query = input.content[-1]["text"]
+        query = input.content[-1]["text"]  # content is a list of MessageContent
     msg_id, created_at = thread_inst.add_query(query)

     structured_content = MessageContent(text=query)
@@ -224,15 +227,18 @@ def create_messages(thread_id, input: CreateMessagesRequest):
         assistant_id=input.assistant_id,
     )

-    # save messages using assistant_id as key
+    # save messages using assistant_id_thread_id as key
     if input.assistant_id is not None:
         with assistants_global_kv as g_assistants:
             agent_inst, _ = g_assistants[input.assistant_id]
-        if agent_inst.with_store:
-            logger.info(f"Save Agent Messages, assistant_id: {input.assistant_id}, thread_id: {thread_id}")
+        if agent_inst.memory_type == "store":
+            logger.info(f"Save Messages, assistant_id: {input.assistant_id}, thread_id: {thread_id}")
             # if with store, db_client initialized already
             global db_client
-            db_client.put(msg_id, message.model_dump_json(), input.assistant_id)
+            namespace = f"{input.assistant_id}_{thread_id}"
+            # put(key: str, val: dict, collection: str = DEFAULT_COLLECTION)
+            db_client.put(msg_id, message.model_dump_json(), namespace)
+            logger.info(f"@@@ Save message to db: {msg_id}, {message.model_dump_json()}, {namespace}")

     return message

@@ -254,15 +260,24 @@ def create_run(thread_id, input: CreateRunResponse):
     with assistants_global_kv as g_assistants:
         agent_inst, _ = g_assistants[assistant_id]

-    config = {"recursion_limit": args.recursion_limit}
+    config = {
+        "recursion_limit": args.recursion_limit,
+        "configurable": {"session_id": thread_id, "thread_id": thread_id, "user_id": assistant_id},
+    }

-    if agent_inst.with_store:
-        # assemble multi-turn messages
+    if agent_inst.memory_type == "store":
         global db_client
-        input_query = assemble_store_messages(db_client.get_all(assistant_id))
+        namespace = f"{assistant_id}_{thread_id}"
+        # get the latest human message from store in the namespace
+        input_query = get_latest_human_message_from_store(db_client, namespace)
+        print("@@@@ Input_query from store: ", input_query)
     else:
         input_query = thread_inst.get_query()
+        print("@@@@ Input_query from thread_inst: ", input_query)

+    print("@@@ Agent instance:")
+    print(agent_inst.id)
+    print(agent_inst.args)
     try:
         return StreamingResponse(
             thread_completion_callback(agent_inst.stream_generator(input_query, config, thread_id), thread_id),
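
The diff above replaces the old `with_store` flag with a `memory_type` field on `AgentConfig`. A rough usage sketch follows; the import path, the `redis_uri` field on `RedisConfig` (inferred from `args.store_config.redis_uri` in base_agent.py), and the URI value are assumptions rather than values confirmed by this diff.

```python
# Hedged usage sketch based on the AgentConfig fields visible in the diff above.
from comps.agent.src.agent import AgentConfig, RedisConfig  # import path assumed

# Default: per-thread checkpointer memory (in-memory MemorySaver).
config = AgentConfig(strategy="react_llama", with_memory=True, memory_type="checkpointer")

# Assistants API with persistent multi-turn history kept in Redis.
config_store = AgentConfig(
    strategy="react_llama",
    with_memory=True,
    memory_type="store",
    store_config=RedisConfig(redis_uri="redis://localhost:6379"),  # placeholder URI
)
```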

comps/agent/src/integrations/agent.py

Lines changed: 6 additions & 2 deletions
@@ -1,9 +1,13 @@
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
+from .storage.persistence_redis import RedisPersistence
 from .utils import load_python_prompt


-def instantiate_agent(args, strategy="react_langchain", with_memory=False):
+def instantiate_agent(args):
+    strategy = args.strategy
+    with_memory = args.with_memory
+
     if args.custom_prompt is not None:
         print(f">>>>>> custom_prompt enabled, {args.custom_prompt}")
         custom_prompt = load_python_prompt(args.custom_prompt)
@@ -22,7 +26,7 @@ def instantiate_agent(args, strategy="react_langchain", with_memory=False):
         print("Initializing ReAct Agent with LLAMA")
         from .strategy.react import ReActAgentLlama

-        return ReActAgentLlama(args, with_memory, custom_prompt=custom_prompt)
+        return ReActAgentLlama(args, custom_prompt=custom_prompt)
     elif strategy == "plan_execute":
         from .strategy.planexec import PlanExecuteAgentWithLangGraph

comps/agent/src/integrations/strategy/base_agent.py

Lines changed: 20 additions & 6 deletions
@@ -3,6 +3,9 @@

 from uuid import uuid4

+from langgraph.checkpoint.memory import MemorySaver
+
+from ..storage.persistence_redis import RedisPersistence
 from ..tools import get_tools_descriptions
 from ..utils import adapt_custom_prompt, setup_chat_model

@@ -12,11 +15,25 @@ def __init__(self, args, local_vars=None, **kwargs) -> None:
         self.llm = setup_chat_model(args)
         self.tools_descriptions = get_tools_descriptions(args.tools)
         self.app = None
-        self.memory = None
         self.id = f"assistant_{self.__class__.__name__}_{uuid4()}"
         self.args = args
         adapt_custom_prompt(local_vars, kwargs.get("custom_prompt"))
-        print(self.tools_descriptions)
+        print("Registered tools: ", self.tools_descriptions)
+
+        if args.with_memory:
+            if args.memory_type == "checkpointer":
+                self.memory_type = "checkpointer"
+                self.checkpointer = MemorySaver()
+                self.store = None
+            elif args.memory_type == "store":
+                # print("Using Redis as store: ", args.store_config.redis_uri)
+                self.store = RedisPersistence(args.store_config.redis_uri)
+                self.memory_type = "store"
+            else:
+                raise ValueError("Invalid memory type!")
+        else:
+            self.store = None
+            self.checkpointer = None

     @property
     def is_vllm(self):
@@ -60,10 +77,7 @@ async def non_streaming_run(self, query, config):
         try:
             async for s in self.app.astream(initial_state, config=config, stream_mode="values"):
                 message = s["messages"][-1]
-                if isinstance(message, tuple):
-                    print(message)
-                else:
-                    message.pretty_print()
+                message.pretty_print()

             last_message = s["messages"][-1]
             print("******Response: ", last_message.content)
