Skip to content

Feat: Support passing knowledge base id as variable in retrieval component #7088

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 30, 2025

Conversation

notsyncing
Copy link
Contributor

What problem does this PR solve?

Fix #6600

Hello, I have the same business requirement as #6600. My use case is:

We have many departments (> 20 now and increasing), and each department has its own knowledge base. Because the agent workflow is the same, so I want to change the knowledge base on the fly, instead of creating agents for every department.

It now looks like this:

屏幕截图_20250416_212622

Knowledge bases can be selected from the dropdown, and passed through the variables in the table. All selected knowledge bases are used for retrieval.

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. 💞 feature Feature request, pull request that fullfill a new feature. labels Apr 16, 2025
@KevinHuSh KevinHuSh requested a review from asiroliu April 17, 2025 01:21
@asiroliu
Copy link
Contributor

@KevinHuSh
Successfully validated all Agent Templates in this version - all templates are functioning normally with proper conversational responses.

@wsxiaotiana
Copy link

@notsyncing @KevinHuSh Hi, I really appreciate your team's hard work. But this feature hasn't been merged after two weeks of validation. Could you share the timeline? Thanks a lot.

@asiroliu
Copy link
Contributor

@notsyncing
I've performed the following validation steps using your latest commit

  1. Successfully built Docker image from latest commit (743751d)
  2. Created test knowledge bases:
  • Cash Management
  • The Chu Shi Biao
image
  1. Initialized simple agent
image
  1. Retrieved KB ID for "The Chu Shi Biao": 7fd6509024ad11f0ae5f1e318b07bb72
image
  1. Test question about Cash Management
image
  1. Test question about The Chu Shi Biao
image

@asiroliu
Copy link
Contributor

@notsyncing
Actually, KB names are already unique in our system (db_models.py#L591)
image

Could we consider using KB names directly instead?

@notsyncing
Copy link
Contributor Author

Could we consider using KB names directly instead?

In our use case, we are passing KB ids from another program, so we don't need KB names. But I think it's better to add a knowledge base type input to the Begin component, and reuse the existing knowledge base selector for that type, so we can simply select KBs from the debug input. What do you think about this?

@notsyncing
Copy link
Contributor Author

Actually, KB names are already unique in our system (db_models.py#L591)

Also, the index=True does not seem to mean unique=True, according to peewee document https://docs.peewee-orm.com/en/latest/peewee/models.html

@asiroliu
Copy link
Contributor

You're right. The KB name uniqueness isn't enforced by index=True here—it's maintained by code like dataset.py#L139
image

@notsyncing
Copy link
Contributor Author

You're right. The KB name uniqueness isn't enforced by index=True here—it's maintained by code like [dataset.py#L139]

Okay. Even you have enforced this, I still advice using id instead of name here:

  1. When using ragflow as a backend service (like our use case), we usually prefer an id instead of a name, because name may change, or may cause some encoding issues with some non-UTF8 client.
  2. When providing ragflow's frontend to user, we can use a selector (like the kb selector in the Retrieval component) for this, to keep a user-friendly behavior, because user may enter a wrong name and was not aware of it until an error occurs. The selector still gives id underneath.

@KevinHuSh KevinHuSh merged commit 6e7dd54 into infiniflow:main Apr 30, 2025
4 checks passed
KevinHuSh pushed a commit that referenced this pull request May 6, 2025
### What problem does this PR solve?

This is a follow-up of #7088 , adding a knowledge base type input to the
`Begin` component, and a knowledge base selector to the agent flow debug
input panel:


![image](https://github.com/user-attachments/assets/e4cd35f1-1c8e-4f69-bed4-5d613b96d148)

then you can select one or more knowledge bases when testing the agent:


![image](https://github.com/user-attachments/assets/724b547e-4790-4cd8-83d3-67e02f2e76d8)

Note: the lines changed in `agent/component/retrieval.py` after line 94
are modified by `ruff format` from the `pre-commit` hooks, no functional
change.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💞 feature Feature request, pull request that fullfill a new feature. size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: Support knowledge base ids as params for retrieval component in agent flow
4 participants