RetrievalAugmentationAdvisor appears to suppress tool usage #3310

Comments
If I understand the use case correctly, you would like a workflow where, based on the question, the model uses a combination of RAG and tool calling. Is that right? If so, I'd recommend adopting a routing approach, so that the model decides dynamically where to fetch the context from among the available options.

One way to implement such an architecture is Agentic RAG, where RAG flows are provided as tools next to other "regular" tools, leaving to the model the task of calling the ones that make sense for the question. You can find an example here: https://github.com/ThomasVitale/llm-apps-java-spring-ai/blob/main/rag/rag-conditional/src/main/java/com/thomasvitale/ai/spring/RagControllerQueryRouting.java

There I defined three tools: two for retrieving context from a vector store, and one for retrieving context from a web search. The model decides which tools to call based on the question. When using the …
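The "RAG flows as tools" idea above can be sketched in plain Java. This is a minimal illustration, not the linked example's actual code: the class and field names are hypothetical, the retrieval backends are injected as plain functions so the sketch stays self-contained, and the `@Tool` annotations (from Spring AI's tool-calling support) are shown only as comments.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of Agentic RAG: each retrieval source is exposed as a
// tool the model may choose to call. In a real Spring AI application the
// methods would carry @Tool annotations and the functions would be backed by
// a VectorStore and a web-search client; here they are injected as lambdas.
class RetrievalTools {
    private final Function<String, List<String>> vectorSearch;
    private final Function<String, List<String>> webSearch;

    RetrievalTools(Function<String, List<String>> vectorSearch,
                   Function<String, List<String>> webSearch) {
        this.vectorSearch = vectorSearch;
        this.webSearch = webSearch;
    }

    // @Tool(description = "Search internal documents for procedures and policies")
    String searchDocuments(String query) {
        // Join the retrieved snippets into a single context string for the model.
        return String.join("\n", vectorSearch.apply(query));
    }

    // @Tool(description = "Search the web for fresh, real-time information")
    String searchWeb(String query) {
        return String.join("\n", webSearch.apply(query));
    }
}
```

With Spring AI, an instance of such a class would be registered on the `ChatClient` call (for example via `.tools(...)`), leaving the model free to pick the retrieval source per question.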
You're right, the use case I'm considering needs a combination of RAG and tool calling, but they often need to be used together and not alternatively. The ideal LLM behavior would be:
Sometimes it does exactly that. Unfortunately, most of the time the model skips the tool call, resulting in stale contact names. Although my prompt and tool description explicitly instruct it to prioritize real-time data, the LLM only follows that guidance when it has already decided to call the tool, instead of calling the tool whenever updated information is needed.

If I understand your proposed solution correctly, it uses either the tool or the RAG, but in this case neither alone could give a complete answer (i.e. both the procedure and the current contact name).

As a workaround, I am considering a post-processing step: after generating the RAG-based answer, run a second LLM pass to detect any contact references and, if found, invoke the tool to replace stale names with current ones in the response. Do you think this approach is feasible? Or is there a better pattern to ensure that both document-based context and fresh tool data are combined effectively? Ideally, I'd prefer to improve …
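The post-processing workaround described above can be sketched as follows. All names here are hypothetical: the detection step is reduced to a lookup of known stale names (in practice it would be the proposed second LLM pass), and the current-contact lookup stands in for the real-time tool call.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the two-pass workaround: after the RAG answer is
// generated, detect contact references and swap in the current contact
// fetched via the real-time tool. Detection here is a simple map of stale
// names to roles; a second LLM call could perform it instead.
class ContactRefresher {
    // Stale names as they appear in the static documents, mapped to the role held.
    private final Map<String, String> staleNameToRole;
    // Resolves a role to its current holder, e.g. by invoking the tool.
    private final Function<String, String> currentContactLookup;

    ContactRefresher(Map<String, String> staleNameToRole,
                     Function<String, String> currentContactLookup) {
        this.staleNameToRole = staleNameToRole;
        this.currentContactLookup = currentContactLookup;
    }

    String refresh(String answer) {
        String result = answer;
        for (Map.Entry<String, String> entry : staleNameToRole.entrySet()) {
            if (result.contains(entry.getKey())) {
                // Replace the stale name with the role's current holder.
                result = result.replace(entry.getKey(),
                        currentContactLookup.apply(entry.getValue()));
            }
        }
        return result;
    }
}
```

A drawback of this design is the extra latency of the second pass, and simple string replacement can misfire on partial matches; it is a fallback rather than a fix for the underlying tool-suppression behavior.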
I managed to improve my implementation by registering the document retriever itself as a tool alongside the others (similar to your example, but without …). Ultimately, the recommendation seems to be not to use …
I'm implementing a RAG system with Spring AI that combines document-based context with dynamic, real-time data fetched via tools. However, I've encountered an issue where combining `RetrievalAugmentationAdvisor` with `ToolCallbacks` leads to significantly reduced tool usage by the model.

Use Case:
My application includes a knowledge base of static documents. These documents contain valuable information (e.g., the description of an employee's role responsibilities, along with the employee's name at the time of writing), but some data (e.g., the name of the current employee holding that role) must be retrieved at runtime via tools.
When I run `ChatClient` using only `RetrievalAugmentationAdvisor` or only `ToolCallbacks`, each works correctly. However, when both are enabled together, the LLM strongly prefers the retrieval context and rarely calls tools (though it sometimes does, so it is aware of them).

I attempted to mitigate this by adjusting the system prompt and modifying the `ContextualQueryAugmenter`'s prompt template to explicitly encourage tool usage over static context when appropriate, but this had no noticeable effect.

Models Tested:
gpt-4.1
gemini-2.5-flash-preview-04-17
gpt-4.1-mini (which seemed slightly more inclined to use tools)

Question/Issue:
I am not sure whether this is a bug in the `RetrievalAugmentationAdvisor` implementation in Spring AI, or rather a limitation of the LLMs used.

Is there any guidance on how `RetrievalAugmentationAdvisor` can be effectively complemented by tools providing fresher data?
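For reference, the problematic combination described above can be sketched roughly as the following wiring fragment. This is an illustrative configuration sketch, not verified against a specific Spring AI version; `chatModel`, `vectorStore`, and `toolCallbacks` are assumed to be beans that already exist in the application.

```java
// Sketch of the setup in question: a ChatClient configured with BOTH the
// RetrievalAugmentationAdvisor and tool callbacks. With this configuration
// the model tends to answer from the retrieved documents and rarely invokes
// the registered tools.
ChatClient chatClient = ChatClient.builder(chatModel)
        .defaultAdvisors(
                RetrievalAugmentationAdvisor.builder()
                        .documentRetriever(VectorStoreDocumentRetriever.builder()
                                .vectorStore(vectorStore)
                                .build())
                        .build())
        .defaultToolCallbacks(toolCallbacks)
        .build();
```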