I fixed the performance problem for gguf tests. By modifying llama-cpp-python and building a new API model, evaluation now takes only about 1/10 of the time the previous method needed.
For details, please refer to abetlen/llama-cpp-python#1983 and https://github.com/For-rest2005/lm-evaluation-harness
The API model is not well written and requires the modification in llama-cpp-python. If needed, I can open a pull request.
This problem was solved with the help of @PolluxyShi.