GPU support #128
Comments
Here's hoping. Consumer cards would only be able to use 8 or 12 GB of VRAM, though.
This would be something to ask the llama.cpp team; until they add GPU support, we can't add it to Serge.
It seems it's now officially supported: ggml-org/llama.cpp#1827
That's a draft; it's not merged/released yet.
I'm not familiar with this project, but I saw this issue mentioned in the llama.cpp pull request. llama.cpp already has cuBLAS and CLBlast support in the master branch; the PR mentioned above improves performance for CUDA GPUs. Check out the docs, which describe building with LLAMA_OPENBLAS and LLAMA_CUBLAS enabled, for more information.
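For reference, a minimal sketch of what that looks like, based on the llama.cpp README at the time (the make flags and the `-ngl` / `--n-gpu-layers` option come from those docs and may have changed in later releases; the model path and layer count below are placeholders):

```sh
# Build llama.cpp with cuBLAS so layers can be offloaded to an NVIDIA GPU
# (flag per the llama.cpp README at the time; may differ in newer releases)
make clean && make LLAMA_CUBLAS=1

# Alternative: OpenBLAS for faster CPU-side prompt processing (no GPU offload)
# make clean && make LLAMA_OPENBLAS=1

# Run inference with the first 32 layers offloaded to the GPU.
# The model path is a placeholder; -ngl / --n-gpu-layers controls how many
# layers are kept in VRAM.
./main -m ./models/7B/ggml-model-q4_0.bin -ngl 32 -p "Hello from the GPU"
```

Serge would presumably need to pass something equivalent to `--n-gpu-layers` through to its llama.cpp backend and ship a CUDA-enabled build in its container.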
Seems this issue should be open.
Duplicate of #43
Is it possible to make the web UI and the container work with a version of llama.cpp that uses the GPU?