diff --git a/FinanceAgent/README.md b/FinanceAgent/README.md index c652e06e02..5c7c0b3b41 100644 --- a/FinanceAgent/README.md +++ b/FinanceAgent/README.md @@ -2,14 +2,20 @@ ## 1. Overview -The architecture of this Finance Agent example is shown in the figure below. The agent has 3 main functions: +The architecture of this Finance Agent example is shown in the figure below. The agent is a hierarchical multi-agent system and has 3 main functions: -1. Summarize long financial documents and provide key points. -2. Answer questions over financial documents, such as SEC filings. -3. Conduct research of a public company and provide an investment report of the company. +1. Summarize long financial documents and provide key points (using OPEA DocSum). +2. Answer questions over financial documents, such as SEC filings (using a worker agent). +3. Conduct research of a public company and provide an investment report of the company (using a worker agent). + +The user interacts with the supervisor agent through the graphical UI. The supervisor agent gets the requests from the user and dispatches tasks to worker agents or to the summarization microservice. The user can also upload documents through the UI. ![Finance Agent Architecture](assets/finance_agent_arch.png) +The architectural diagram of the `dataprep` microservice is shown below. We use [docling](https://github.com/docling-project/docling) to extract text from PDFs and URLs into markdown format. Both the full document content and tables are extracted. We then use an LLM to extract metadata from the document, including the company name, year, quarter, document type, and document title. The full document markdown is then chunked, an LLM summarizes each chunk, and the summaries are embedded and saved to a vector database. Each table is also summarized by an LLM, and the summaries are embedded and saved to the vector database. The chunks and tables are also saved into a KV store. 
The pipeline is designed this way to improve the retrieval accuracy of the `search_knowledge_base` tool used by the Question Answering worker agent. + +![dataprep architecture](assets/fin_agent_dataprep.png) + The `dataprep` microservice can ingest financial documents in two formats: 1. PDF documents stored locally, such as SEC filings saved in local directory. @@ -20,6 +26,10 @@ Please note: 1. Each financial document should be about one company. 2. URLs ending in `.htm` are not supported. +The Question Answering worker agent uses the `search_knowledge_base` tool to get relevant information. The tool uses a dense retriever and a BM25 retriever to get many pieces of information, including financial statement tables. An LLM then extracts the information relevant to the query from the retrieved documents. Refer to the diagram below. We found that using this method significantly improves agent performance. + +![finqa search tool arch](assets/finqa_tool.png) + ## 2. Getting started ### 2.1 Download repos @@ -27,8 +37,8 @@ Please note: ```bash mkdir /path/to/your/workspace/ export WORKDIR=/path/to/your/workspace/ -genaicomps -genaiexamples +cd $WORKDIR +git clone https://github.com/opea-project/GenAIExamples.git ``` ### 2.2 Set up env vars @@ -36,15 +46,19 @@ genaiexamples ```bash export HF_CACHE_DIR=/path/to/your/model/cache/ export HF_TOKEN= - +export FINNHUB_API_KEY= # go to https://finnhub.io/ to get your free api key +export FINANCIAL_DATASETS_API_KEY= # go to https://docs.financialdatasets.ai/ to get your free api key ``` -### 2.3 Build docker images +### 2.3 [Optional] Build docker images -Build docker images for dataprep, agent, agent-ui. +Only needed when `docker pull` fails. 
```bash -cd GenAIExamples/FinanceAgent/docker_image_build +cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build +# get GenAIComps repo +git clone https://github.com/opea-project/GenAIComps.git +# build the images docker compose -f build.yaml build --no-cache ``` @@ -92,6 +106,8 @@ python $WORKPATH/tests/test_redis_finance.py --port 6007 --test_option get ### 3.3 Launch the multi-agent system +The command below will launch 3 agent microservices, 1 docsum microservice, and 1 UI microservice. + ```bash # inside $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi/ bash launch_agents.sh @@ -115,14 +131,14 @@ prompt="generate NVDA financial research report" python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance" ``` -Supervisor ReAct Agent: +Supervisor Agent single turn: ```bash export agent_port="9090" python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream ``` -Supervisor ReAct Agent Multi turn: +Supervisor Agent multi turn: ```bash python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream @@ -134,12 +150,32 @@ python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervis The UI microservice is launched in the previous step with the other microservices. To see the UI, open a web browser to `http://${ip_address}:5175` to access the UI. Note the `ip_address` here is the host IP of the UI microservice. -1. `create Admin Account` with a random value +1. Create Admin Account with a random value + +2. Enter the endpoints in the `Connections` settings + + First, click on the user icon in the upper right corner to open `Settings`. Click on `Admin Settings`. Click on `Connections`. 
+ + Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${ip_address}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `ip_address` here should be the host IP of the agent microservice. + + Then, enter the dataprep endpoint in the `Icloud File API` section. You first need to enable `Icloud File API` by clicking on the button on the right to turn it green, and then enter the endpoint URL, for example, `http://${ip_address}:6007/v1`. The `ip_address` here should be the host IP of the dataprep microservice. + + You should see a screen like the screenshot below when the settings are done. + +![opea-agent-setting](assets/ui_connections_settings.png) + +3. Upload documents with UI + + Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `Icloud Knowledge`. You can paste a URL on the left-hand side of the pop-up window, or upload a local file by clicking on the cloud icon on the right-hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait until processing finishes; the pop-up window closes on its own when the data ingestion is done. See the screenshot below. + + Note: the data ingestion may take a few minutes depending on the length of the document. Please wait patiently and do not close the pop-up window. + +![upload-doc-ui](assets/upload_doc_ui.png) -2. use an opea agent endpoint, for example, the `Research Agent` endpoint `http://$ip_address:9096/v1`, which is a openai compatible api +4. Test agent with UI -![opea-agent-setting](assets/opea-agent-setting.png) + After the settings are done and documents are ingested, you can start asking the agent questions. Click on the `New Chat` icon in the top left corner, and type your questions in the text box in the middle of the UI. -3. test opea agent with ui + The UI will stream the agent's response tokens. 
You need to expand the `Thinking` tab to see the agent's reasoning process. After the agent made tool calls, you would also see the tool output after the tool returns output to the agent. Note: it may take a while to get the tool output back if the tool execution takes time. ![opea-agent-test](assets/opea-agent-test.png) diff --git a/FinanceAgent/assets/fin_agent_dataprep.png b/FinanceAgent/assets/fin_agent_dataprep.png new file mode 100644 index 0000000000..33ec739670 Binary files /dev/null and b/FinanceAgent/assets/fin_agent_dataprep.png differ diff --git a/FinanceAgent/assets/finance_agent_arch.png b/FinanceAgent/assets/finance_agent_arch.png index 5f0f1ad5b0..8a2a863608 100644 Binary files a/FinanceAgent/assets/finance_agent_arch.png and b/FinanceAgent/assets/finance_agent_arch.png differ diff --git a/FinanceAgent/assets/finqa_tool.png b/FinanceAgent/assets/finqa_tool.png new file mode 100644 index 0000000000..d8f9c30cae Binary files /dev/null and b/FinanceAgent/assets/finqa_tool.png differ diff --git a/FinanceAgent/assets/ui_connections_settings.png b/FinanceAgent/assets/ui_connections_settings.png new file mode 100644 index 0000000000..a68df00075 Binary files /dev/null and b/FinanceAgent/assets/ui_connections_settings.png differ diff --git a/FinanceAgent/assets/upload_doc_ui.png b/FinanceAgent/assets/upload_doc_ui.png new file mode 100644 index 0000000000..4af6a1c756 Binary files /dev/null and b/FinanceAgent/assets/upload_doc_ui.png differ diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml index cc02847be9..997aade843 100644 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml +++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml @@ -47,7 +47,7 @@ services: ip_address: ${ip_address} strategy: react_llama with_memory: false - recursion_limit: ${recursion_limit_worker} + recursion_limit: 25 llm_engine: vllm HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} 
llm_endpoint_url: ${LLM_ENDPOINT_URL} @@ -68,7 +68,7 @@ services: container_name: supervisor-agent-endpoint depends_on: - worker-finqa-agent - # - worker-research-agent + - worker-research-agent volumes: - ${TOOLSET_PATH}:/home/user/tools/ - ${PROMPT_PATH}:/home/user/prompts/ diff --git a/FinanceAgent/docker_image_build/build.yaml b/FinanceAgent/docker_image_build/build.yaml index 867cc3a8c4..7d113148a3 100644 --- a/FinanceAgent/docker_image_build/build.yaml +++ b/FinanceAgent/docker_image_build/build.yaml @@ -20,9 +20,3 @@ services: https_proxy: ${https_proxy} no_proxy: ${no_proxy} image: ${REGISTRY:-opea}/agent:${TAG:-latest} - # agent-ui: - # build: - # context: ../ui - # dockerfile: ./docker/Dockerfile - # extends: agent - # image: ${REGISTRY:-opea}/agent-ui:${TAG:-latest} diff --git a/FinanceAgent/prompts/research_prompt.py b/FinanceAgent/prompts/research_prompt.py index 7c3c925753..65bc8710cd 100644 --- a/FinanceAgent/prompts/research_prompt.py +++ b/FinanceAgent/prompts/research_prompt.py @@ -33,6 +33,7 @@ 4. Provide stock performance, because the financial report is used for stock investment analysis. 5. Read the execution history if any to understand the tools that have been called and the information that has been gathered. 6. Reason about the information gathered so far and decide if you can answer the question or if you need to call more tools. +7. Most of the tools need ticker symbol, use your knowledge to convert the company name to the ticker symbol if user only provides the company name. 
**Output format:** You should output your thought process: diff --git a/FinanceAgent/tests/test.py b/FinanceAgent/tests/test.py index a6cd69583c..2eb75cb06a 100644 --- a/FinanceAgent/tests/test.py +++ b/FinanceAgent/tests/test.py @@ -18,13 +18,13 @@ def process_request(url, query, is_stream=False): else: for line in resp.iter_lines(decode_unicode=True): print(line) - ret = None + ret = "Done" resp.raise_for_status() # Raise an exception for unsuccessful HTTP status codes return ret except requests.exceptions.RequestException as e: - ret = f"An error occurred:{e}" - return None + ret = f"ERROR OCCURRED IN TEST:{e}" + return ret def test_worker_agent(args): @@ -35,6 +35,11 @@ def test_worker_agent(args): query = {"role": "user", "messages": args.prompt, "stream": "false", "tool_choice": args.tool_choice} ret = process_request(url, query) print("Response: ", ret) + if "ERROR OCCURRED IN TEST" in ret: + print("Error in response, please check the server.") + return "ERROR OCCURRED IN TEST" + else: + return "test completed with success" def add_message_and_run(url, user_message, thread_id, stream=False): @@ -42,6 +47,7 @@ def add_message_and_run(url, user_message, thread_id, stream=False): query = {"role": "user", "messages": user_message, "thread_id": thread_id, "stream": stream} ret = process_request(url, query, is_stream=stream) print("Response: ", ret) + return ret def test_chat_completion_multi_turn(args): @@ -51,14 +57,21 @@ def test_chat_completion_multi_turn(args): # first turn print("===============First turn==================") user_message = "Key takeaways of Gap's 2024 Q4 earnings call?" 
- add_message_and_run(url, user_message, thread_id, stream=args.stream) + ret = add_message_and_run(url, user_message, thread_id, stream=args.stream) + if "ERROR OCCURRED IN TEST" in ret: + print("Error in response, please check the server.") + return "ERROR OCCURRED IN TEST" print("===============End of first turn==================") # second turn print("===============Second turn==================") user_message = "What was Gap's forecast for 2025?" - add_message_and_run(url, user_message, thread_id, stream=args.stream) + ret = add_message_and_run(url, user_message, thread_id, stream=args.stream) + if "ERROR OCCURRED IN TEST" in ret: + print("Error in response, please check the server.") + return "ERROR OCCURRED IN TEST" print("===============End of second turn==================") + return "test completed with success" def test_supervisor_agent_single_turn(args): @@ -66,12 +79,16 @@ def test_supervisor_agent_single_turn(args): query_list = [ "What was Gap's revenue growth in 2024?", "Can you summarize Costco's 2025 Q2 earnings call?", - # "Should I increase investment in Costco?", + "Should I increase investment in Johnson & Johnson?", ] for query in query_list: thread_id = f"{uuid.uuid4()}" - add_message_and_run(url, query, thread_id, stream=args.stream) + ret = add_message_and_run(url, query, thread_id, stream=args.stream) + if "ERROR OCCURRED IN TEST" in ret: + print("Error in response, please check the server.") + return "ERROR OCCURRED IN TEST" print("=" * 50) + return "test completed with success" if __name__ == "__main__": @@ -89,10 +106,12 @@ def test_supervisor_agent_single_turn(args): if args.agent_role == "supervisor": if args.multi_turn: - test_chat_completion_multi_turn(args) + ret = test_chat_completion_multi_turn(args) else: - test_supervisor_agent_single_turn(args) + ret = test_supervisor_agent_single_turn(args) + print(ret) elif args.agent_role == "worker": - test_worker_agent(args) + ret = test_worker_agent(args) + print(ret) else: raise 
ValueError("Invalid agent role") diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index 18615646b4..207dcc62f1 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -181,9 +181,9 @@ function validate_agent_service() { # # test worker research agent echo "======================Testing worker research agent======================" export agent_port="9096" - prompt="generate NVDA financial research report" + prompt="Johnson & Johnson" local CONTENT=$(python3 $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance") - local EXIT_CODE=$(validate "$CONTENT" "NVDA" "research-agent-endpoint") + local EXIT_CODE=$(validate "$CONTENT" "Johnson" "research-agent-endpoint") echo $CONTENT echo $EXIT_CODE local EXIT_CODE="${EXIT_CODE:0-1}" @@ -197,24 +197,24 @@ function validate_agent_service() { export agent_port="9090" local CONTENT=$(python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream) echo $CONTENT - # local EXIT_CODE=$(validate "$CONTENT" "" "react-agent-endpoint") - # echo $EXIT_CODE - # local EXIT_CODE="${EXIT_CODE:0-1}" - # if [ "$EXIT_CODE" == "1" ]; then - # docker logs react-agent-endpoint - # exit 1 - # fi - - echo "======================Testing supervisor agent: multi turns ======================" + local EXIT_CODE=$(validate "$CONTENT" "test completed with success" "supervisor-agent-endpoint") + echo $EXIT_CODE + local EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs supervisor-agent-endpoint + exit 1 + fi + + # echo "======================Testing supervisor agent: multi turns ======================" local CONTENT=$(python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream) 
echo $CONTENT - # local EXIT_CODE=$(validate "$CONTENT" "" "react-agent-endpoint") - # echo $EXIT_CODE - # local EXIT_CODE="${EXIT_CODE:0-1}" - # if [ "$EXIT_CODE" == "1" ]; then - # docker logs react-agent-endpoint - # exit 1 - # fi + local EXIT_CODE=$(validate "$CONTENT" "test completed with success" "supervisor-agent-endpoint") + echo $EXIT_CODE + local EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs supervisor-agent-endpoint + exit 1 + fi } @@ -237,7 +237,7 @@ stop_dataprep cd $WORKPATH/tests -# echo "=================== #1 Building docker images====================" +echo "=================== #1 Building docker images====================" build_vllm_docker_image build_dataprep_agent_images @@ -245,14 +245,14 @@ build_dataprep_agent_images # build_agent_image_local # echo "=================== #1 Building docker images completed====================" -# echo "=================== #2 Start vllm endpoint====================" +echo "=================== #2 Start vllm endpoint====================" start_vllm_service_70B -# echo "=================== #2 vllm endpoint started====================" +echo "=================== #2 vllm endpoint started====================" -# echo "=================== #3 Start dataprep and ingest data ====================" +echo "=================== #3 Start dataprep and ingest data ====================" start_dataprep ingest_validate_dataprep -# echo "=================== #3 Data ingestion and validation completed====================" +echo "=================== #3 Data ingestion and validation completed====================" echo "=================== #4 Start agents ====================" start_agents diff --git a/FinanceAgent/tools/research_agent_tools.yaml b/FinanceAgent/tools/research_agent_tools.yaml index bcf4e2bac2..69f8e35ae8 100644 --- a/FinanceAgent/tools/research_agent_tools.yaml +++ b/FinanceAgent/tools/research_agent_tools.yaml @@ -7,7 +7,7 @@ get_company_profile: args_schema: symbol: type: str - 
description: the company name or ticker symbol. + description: the ticker symbol. return_output: profile get_company_news: @@ -16,7 +16,7 @@ get_company_news: args_schema: symbol: type: str - description: the company name or ticker symbol. + description: the ticker symbol. start_date: type: str description: start date of the search period for the company's basic financials, yyyy-mm-dd. @@ -34,7 +34,7 @@ get_basic_financials_history: args_schema: symbol: type: str - description: the company name or ticker symbol. + description: the ticker symbol. freq: type: str description: reporting frequency of the company's basic financials, such as annual, quarterly. @@ -55,7 +55,7 @@ get_basic_financials: args_schema: symbol: type: str - description: the company name or ticker symbol. + description: the ticker symbol. selected_columns: type: list description: List of column names of news to return, should be chosen from 'assetTurnoverTTM', 'bookValue', 'cashRatio', 'currentRatio', 'ebitPerShare', 'eps', 'ev', 'fcfMargin', 'fcfPerShareTTM', 'grossMargin', 'inventoryTurnoverTTM', 'longtermDebtTotalAsset', 'longtermDebtTotalCapital', 'longtermDebtTotalEquity', 'netDebtToTotalCapital', 'netDebtToTotalEquity', 'netMargin', 'operatingMargin', 'payoutRatioTTM', 'pb', 'peTTM', 'pfcfTTM', 'pretaxMargin', 'psTTM', 'ptbv', 'quickRatio', 'receivablesTurnoverTTM', 'roaTTM', 'roeTTM', 'roicTTM', 'rotcTTM', 'salesPerShare', 'sgaToSale', 'tangibleBookValue', 'totalDebtToEquity', 'totalDebtToTotalAsset', 'totalDebtToTotalCapital', 'totalRatio','10DayAverageTradingVolume', '13WeekPriceReturnDaily', '26WeekPriceReturnDaily', '3MonthADReturnStd', '3MonthAverageTradingVolume', '52WeekHigh', '52WeekHighDate', '52WeekLow', '52WeekLowDate', '52WeekPriceReturnDaily', '5DayPriceReturnDaily', 'assetTurnoverAnnual', 'assetTurnoverTTM', 'beta', 'bookValuePerShareAnnual', 'bookValuePerShareQuarterly', 'bookValueShareGrowth5Y', 'capexCagr5Y', 'cashFlowPerShareAnnual', 'cashFlowPerShareQuarterly', 
'cashFlowPerShareTTM', 'cashPerSharePerShareAnnual', 'cashPerSharePerShareQuarterly', 'currentDividendYieldTTM', 'currentEv/freeCashFlowAnnual', 'currentEv/freeCashFlowTTM', 'currentRatioAnnual', 'currentRatioQuarterly', 'dividendGrowthRate5Y', 'dividendPerShareAnnual', 'dividendPerShareTTM', 'dividendYieldIndicatedAnnual', 'ebitdPerShareAnnual', 'ebitdPerShareTTM', 'ebitdaCagr5Y', 'ebitdaInterimCagr5Y', 'enterpriseValue', 'epsAnnual', 'epsBasicExclExtraItemsAnnual', 'epsBasicExclExtraItemsTTM', 'epsExclExtraItemsAnnual', 'epsExclExtraItemsTTM', 'epsGrowth3Y', 'epsGrowth5Y', 'epsGrowthQuarterlyYoy', 'epsGrowthTTMYoy', 'epsInclExtraItemsAnnual', 'epsInclExtraItemsTTM', 'epsNormalizedAnnual', 'epsTTM', 'focfCagr5Y', 'grossMargin5Y', 'grossMarginAnnual', 'grossMarginTTM', 'inventoryTurnoverAnnual', 'inventoryTurnoverTTM', 'longTermDebt/equityAnnual', 'longTermDebt/equityQuarterly', 'marketCapitalization', 'monthToDatePriceReturnDaily', 'netIncomeEmployeeAnnual', 'netIncomeEmployeeTTM', 'netInterestCoverageAnnual', 'netInterestCoverageTTM', 'netMarginGrowth5Y', 'netProfitMargin5Y', 'netProfitMarginAnnual', 'netProfitMarginTTM', 'operatingMargin5Y'. @@ -72,7 +72,7 @@ analyze_balance_sheet: args_schema: symbol: type: str - description: the company name or ticker symbol. + description: the ticker symbol. period: type: str description: The period of the balance sheets, possible values such as annual, quarterly, ttm. Default is 'annual'. @@ -87,7 +87,7 @@ analyze_income_stmt: args_schema: symbol: type: str - description: the company name or ticker symbol. + description: the ticker symbol. period: type: str description: The period of the balance sheets, possible values, such as annual, quarterly, ttm. Default is 'annual'. @@ -102,7 +102,7 @@ analyze_cash_flow: args_schema: symbol: type: str - description: the company name or ticker symbol. + description: the ticker symbol. 
period: type: str description: The period of the balance sheets, possible values, such as annual, quarterly, ttm. Default is 'annual'. @@ -117,7 +117,7 @@ get_share_performance: args_schema: symbol: type: str - description: the company name or ticker symbol. + description: the ticker symbol. end_date: type: str description: end date of the search period for the company's basic financials, yyyy-mm-dd. diff --git a/FinanceAgent/tools/supervisor_agent_tools.yaml b/FinanceAgent/tools/supervisor_agent_tools.yaml index 0b451557e3..458f44e0fb 100644 --- a/FinanceAgent/tools/supervisor_agent_tools.yaml +++ b/FinanceAgent/tools/supervisor_agent_tools.yaml @@ -22,11 +22,11 @@ summarization_tool: description: Name of the company document belongs to return_output: summary -# research_agent: -# description: generate research report on a specified company with fundamentals analysis, sentiment analysis and risk analysis. -# callable_api: supervisor_tools.py:research_agent -# args_schema: -# company: -# type: str -# description: the company name -# return_output: report +research_agent: + description: generate research report on a specified company with fundamentals analysis, sentiment analysis and risk analysis. + callable_api: supervisor_tools.py:research_agent + args_schema: + company: + type: str + description: the company name + return_output: report
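
The changes to `tests/test.py` in this diff rely on a string convention: `process_request` returns a message containing `ERROR OCCURRED IN TEST` on failure, each test function returns either that marker or `test completed with success`, and `test_compose_on_gaudi.sh` validates against the success string. A minimal sketch of that convention follows. The helper names `is_error_response` and `summarize` are hypothetical and do not exist in the repository, and the case-insensitive comparison is an assumption made here so that the marker matches regardless of how a caller normalizes the response.

```python
# Hypothetical helpers illustrating the marker convention; not part of the repo.
ERROR_MARKER = "ERROR OCCURRED IN TEST"
SUCCESS_MESSAGE = "test completed with success"


def is_error_response(ret):
    """Return True when a response string carries the failure marker."""
    # Lower-case both sides so the comparison cannot miss because of case.
    return isinstance(ret, str) and ERROR_MARKER.lower() in ret.lower()


def summarize(responses):
    """Collapse a list of per-query responses into one pass/fail string."""
    for ret in responses:
        if is_error_response(ret):
            print("Error in response, please check the server.")
            return ERROR_MARKER
    return SUCCESS_MESSAGE
```

Keeping the marker and the success message in named constants means the Python tests and any caller that checks their output refer to one definition instead of repeating string literals.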