docs: update README on preparing quantized model #19
Conversation
Correct some outdated references to files within the llama.cpp repo and update the example to use a smaller model.
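The preparation steps the PR updates might look roughly like the following. Script and binary names are assumptions based on the current llama.cpp layout (the older `convert.py` and `quantize` names were renamed upstream), not quoted from the PR itself; verify them against the llama.cpp repo before use.

```shell
# Sketch of preparing a quantized GGUF model with llama.cpp.
# Paths and script names assume the current llama.cpp layout; check the repo.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Convert a Hugging Face checkpoint to GGUF, then quantize it to 4 bits.
python convert_hf_to_gguf.py /path/to/hf-model --outfile model-f16.gguf
./build/bin/llama-quantize model-f16.gguf model-q4_0.gguf q4_0
```

A smaller base model keeps the conversion and quantization steps quick, which is presumably why the example was changed.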
@yoshoku I don't understand your commitlint. I get the line length, but it's referring to some […]
@krschacht Thank you for your contribution. llama_cpp.rb adopts conventional commits: https://www.conventionalcommits.org/en/v1.0.0/ For example, the commit message for this change might be […]
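Conventional commits put a type prefix (`feat`, `fix`, `docs`, …) and an optional scope in front of the subject line. As a minimal sketch, a first-line check could look like this; `conventional_commit?` is a hypothetical helper written for illustration, not part of commitlint or llama_cpp.rb:

```ruby
# Hypothetical checker for the Conventional Commits first-line format:
# type(scope)?: subject, kept within a typical line-length limit.
CONVENTIONAL = /\A(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([^)]+\))?!?: .+\z/

def conventional_commit?(message)
  first_line = message.lines.first.to_s.chomp
  first_line.match?(CONVENTIONAL) && first_line.length <= 100
end

puts conventional_commit?("docs: update README on preparing quantized model") # => true
puts conventional_commit?("Update README")                                    # => false
```

The PR title above ("docs: update README on preparing quantized model") already follows this shape; only the underlying commit message needed the same prefix.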
@yoshoku Ah, got it. Somehow I never knew about conventional commits! TIL :) I just updated the PR so hopefully it's ready to go. BTW, do you have other plans for this project? I was very excited to find this. I found it while looking for the ruby equivalent of Python's […] The one issue I ran into with your project was when I tried to load a model like […]
@krschacht I wanted you to fix the git commit message, not the pull request description. But I understood the gist of the pull request, so I fixed the README with you as a co-author in 8edfd6d. I am going to close this pull request, but please do not take it personally.
@yoshoku I don't mind at all! Linters... I'm not looking for points. :) I also run a project and I jump in to help get PRs over the line all the time. It's often easier. I really am interested in whether you have other plans for this project. I was thinking about trying to create a version of what you did but for CTranslate2. But maybe I'm wrong and your llama_cpp bindings can also work for T5 models? Anyway, curious where you plan to take your project.
@krschacht I think it would be a good idea to create bindings for CTranslate2, but I am pretty busy these days so I probably will not have time to do it.
I didn't realize llama.cpp had recently added support for T5. I found an interesting thread discussing the performance of CTranslate2 vs llama.cpp, and it sounds like some of the performance optimizations within CTranslate2 are already in llama.cpp, so the performance may not be that different. Anyway, I appreciate you creating this project and for sharing this insight. Consider me a motivated & interested "user" in case you ever need help with testing, bugs, implementing specific pieces, etc. I'll keep playing with llama_cpp.rb now that I have it all working!
docs: update description of preparing quantized model in usage section.