RetrievalAugmentationAdvisor appears to suppress tool usage #3310

Open
amagnolo opened this issue May 23, 2025 · 1 comment
I’m implementing a RAG system with Spring AI that combines document-based context with dynamic, real-time data fetched via tools. However, I’ve encountered an issue where combining RetrievalAugmentationAdvisor with ToolCallbacks leads to significantly reduced tool usage by the model.

Use Case:

My application includes a knowledge base of static documents. These documents contain valuable information (e.g., the description of a role's responsibilities and the name of the employee who held it at the time of writing), but some data (e.g., the name of the employee currently holding that role) must be retrieved at runtime via tools.

When I run ChatClient using only RetrievalAugmentationAdvisor or only ToolCallbacks, each works correctly. However, when both are enabled together, the LLM strongly prefers the retrieval context and rarely calls tools. It does call them occasionally, so it is clearly aware of them.
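For reference, a minimal sketch of the combined setup described above (the tool class `HrTools` and the injected `chatModel`/`vectorStore` beans are hypothetical placeholders):

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.rag.advisor.RetrievalAugmentationAdvisor;
import org.springframework.ai.rag.retrieval.search.VectorStoreDocumentRetriever;

// Both the RAG advisor and the tools are registered on the same ChatClient.
ChatClient chatClient = ChatClient.builder(chatModel)
        .defaultAdvisors(RetrievalAugmentationAdvisor.builder()
                .documentRetriever(VectorStoreDocumentRetriever.builder()
                        .vectorStore(vectorStore)
                        .build())
                .build())
        .defaultTools(new HrTools()) // @Tool-annotated methods for real-time data
        .build();

String answer = chatClient.prompt()
        .user("Who currently holds the Head of Security role?")
        .call()
        .content();
```

With this wiring the retrieval step always runs before the model call, which is the behavior at the heart of this issue.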

I attempted to mitigate this by adjusting the system prompt and modifying the ContextualQueryAugmenter’s prompt template to explicitly encourage tool usage over static context when appropriate, but this had no noticeable effect.
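The prompt-template mitigation looked roughly like the following sketch (the template wording is illustrative; the `{context}` and `{query}` placeholders match the augmenter's default template):

```java
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.rag.advisor.RetrievalAugmentationAdvisor;
import org.springframework.ai.rag.preretrieval.query.augmentation.ContextualQueryAugmenter;
import org.springframework.ai.rag.retrieval.search.VectorStoreDocumentRetriever;

// Custom augmenter template nudging the model toward tools for live data.
var ragAdvisor = RetrievalAugmentationAdvisor.builder()
        .documentRetriever(VectorStoreDocumentRetriever.builder()
                .vectorStore(vectorStore)
                .build())
        .queryAugmenter(ContextualQueryAugmenter.builder()
                .promptTemplate(new PromptTemplate("""
                        Context from static documents:
                        {context}

                        Answer the query below. If it requires current,
                        real-time data (e.g., who holds a role today),
                        prefer calling the available tools over the
                        static context above.

                        Query: {query}
                        """))
                .build())
        .build();
```

Even with this explicit instruction in the augmented prompt, the models still favored the retrieved context.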

Models Tested:

  • gpt-4.1
  • gemini-2.5-flash-preview-04-17
  • gpt-4.1-mini (which seemed slightly more inclined to use tools)

Question/Issue:

I am not sure whether this is a bug in the RetrievalAugmentationAdvisor implementation in Spring AI or a limitation of the LLMs used.

Is there any guidance on how RetrievalAugmentationAdvisor can be effectively complemented by tools that provide fresher data?


ThomasVitale commented May 29, 2025

If I understand the use case correctly, you would like a workflow where, based on the question, the model uses a combination of RAG and tool calling. Is that right?

If it is, then I'd recommend adopting a routing approach so that the model decides dynamically where to fetch the context from among the available options.

One way to implement such an architecture is Agentic RAG, where RAG flows are provided as tools next to other "regular" tools, leaving it to the model to call the ones that make sense for the question.

You can find an example here: https://github.com/ThomasVitale/llm-apps-java-spring-ai/blob/main/rag/rag-conditional/src/main/java/com/thomasvitale/ai/spring/RagControllerQueryRouting.java In that example, I defined three tools: two for retrieving context from a vector store and one for retrieving context from a web search. The model decides which tools to call based on the question.


When using the RetrievalAugmentationAdvisor, the RAG flow is always executed. The LLM doesn't have any "decision power" about it since it's something you enable explicitly from the application. That's by design. If you need to introduce conditional routes, then we need to adopt a pattern like the one I shared above.
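The routing pattern can be sketched as follows: the vector-store retrieval is wrapped in a `@Tool` method and registered next to a real-time lookup tool, so the model picks per question. The class and method names, tool descriptions, and the `hrService` dependency are hypothetical:

```java
import java.util.stream.Collectors;
import org.springframework.ai.document.Document;
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.vectorstore.VectorStore;

class ContextTools {

    private final VectorStore vectorStore;
    private final HrService hrService; // hypothetical real-time HR lookup service

    ContextTools(VectorStore vectorStore, HrService hrService) {
        this.vectorStore = vectorStore;
        this.hrService = hrService;
    }

    @Tool(description = "Retrieve background documents about roles and responsibilities")
    String retrieveRoleDocuments(String question) {
        // RAG as a tool: the model triggers retrieval only when it needs it.
        return vectorStore.similaritySearch(question).stream()
                .map(Document::getText)
                .collect(Collectors.joining(System.lineSeparator()));
    }

    @Tool(description = "Look up the employee currently holding a given role")
    String currentEmployeeForRole(String role) {
        return hrService.findCurrentHolder(role);
    }
}
```

Registering this class via `chatClient.prompt().tools(new ContextTools(vectorStore, hrService))` (instead of enabling RetrievalAugmentationAdvisor) gives the model the decision power over when to retrieve static context versus fetch live data.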
