What would you like to see?
I'm currently using the experimental Gemini Embeddings model via the generic OpenAI-compatible API in AnythingLLM Desktop. The model has a rate limit of 5/10 RPM. Even after setting the max concurrent chunks to 1, I still sometimes hit a 429 error when uploading certain PDF files (sent one at a time, a few minutes apart, each under 1 MB). The same files embed without issue through the Gemini API with text-embedding-004.
> 429 RESOURCE_EXHAUSTED: You've exceeded the rate limit. You are sending too many requests per minute with the free tier Gemini API. Ensure you're within the model's rate limit. Request a quota increase if needed.
Would it be possible to add a configuration option that allows users to set a delay between embedding requests?
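As a rough illustration of the requested behavior, the delay could be implemented as a minimum-interval throttle wrapped around each embedding request. This is only a sketch of the idea, not AnythingLLM's actual code; `min_interval` and `embed_chunk` are hypothetical names introduced here.

```python
import time
from functools import wraps

def min_interval(seconds: float):
    """Decorator enforcing a minimum delay between successive calls.

    A sketch of the requested config option ("delay between embedding
    requests"); AnythingLLM does not currently expose this setting.
    """
    def decorator(fn):
        last_call = [float("-inf")]  # monotonic timestamp of the previous call
        @wraps(fn)
        def wrapper(*args, **kwargs):
            wait = last_call[0] + seconds - time.monotonic()
            if wait > 0:
                time.sleep(wait)  # back off until the interval has elapsed
            last_call[0] = time.monotonic()
            return fn(*args, **kwargs)
        return wrapper
    return decorator

# Hypothetical embedding call: a 5 RPM limit would mean one request
# every 12 seconds (a short interval is used here for demonstration).
@min_interval(0.05)
def embed_chunk(text: str) -> int:
    return len(text)  # stand-in for the real embedding request
```

With a user-configurable `seconds` value, even a provider with a strict free-tier RPM limit would never see two requests closer together than the configured delay, regardless of how chunks are batched.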