
[FEAT]: Configurable Delay Between Embedding Requests #3570


Open
RuiU opened this issue Apr 1, 2025 · 1 comment
Labels: enhancement, feature request

Comments

RuiU commented Apr 1, 2025

What would you like to see?

I'm currently using the experimental Gemini Embeddings model via the generic OpenAI API in AnythingLLM Desktop. The model has a rate limit of 5/10 RPM. Even after setting the max concurrent chunks to 1, I still sometimes get a 429 error when uploading PDF files (sent one at a time, a few minutes apart, each under 1 MB). The same files embed without issue using the Gemini API's text-embedding-004.

[429 RESOURCE_EXHAUSTED You've exceeded the rate limit. You are sending too many requests per minute with the free tier Gemini API. Ensure you're within the model's rate limit. Request a quota increase if needed]

Would it be possible to add a configuration option that allows users to set a delay between embedding requests?
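As a sketch of what such an option could look like, the snippet below shows a minimal client-side throttle that enforces a minimum gap between embedding calls. The `ThrottledEmbedder` class, the `delay_seconds` setting, and the `embed_fn` callback are all hypothetical illustrations of the requested feature, not part of AnythingLLM's actual code or configuration.

```python
import time


class ThrottledEmbedder:
    """Wrap an embedding function and enforce a minimum delay between calls.

    `delay_seconds` is a hypothetical user-facing setting; with a 10 RPM
    limit, a value of 6.0 or more would keep requests under the quota.
    """

    def __init__(self, embed_fn, delay_seconds=6.0):
        self.embed_fn = embed_fn
        self.delay_seconds = delay_seconds
        self._last_request = 0.0  # monotonic timestamp of the previous call

    def embed(self, chunk):
        # Sleep until at least `delay_seconds` has elapsed since the last request.
        wait = self.delay_seconds - (time.monotonic() - self._last_request)
        if wait > 0:
            time.sleep(wait)
        self._last_request = time.monotonic()
        return self.embed_fn(chunk)
```

A real implementation would likely also retry on a 429 response with exponential backoff, but even a fixed configurable delay like this would avoid most free-tier rate-limit errors.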

RuiU added the enhancement and feature request labels on Apr 1, 2025