Replies: 2 comments
-
I agree with the sentiment; I think it should evolve over time into something more general. That would allow this project to stay relevant and really scale up in a way everyone benefits from. Ideally we all work on the same core components (like llama.cpp) so we don't repeat work and can collectively spend more time doing cutting-edge research and productizing it.
-
Yes, I am thinking of better ways to make the work in this repo more general so that it applies to other models. However, at the moment I really want to focus on finalizing the quantization algorithm - it's very important to have an optimal way to quantize models, as this is the most valuable feature of this implementation. Until we finish that, I don't want to start moving stuff around, as it will be very difficult to handle.
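(For context on what "quantization" means here: below is a rough, illustrative sketch of block quantization in the general ggml style - per-block fp32 scales and packed 4-bit values. The block size, function name, and exact rounding are assumptions for illustration, not the repo's actual Q4 implementation, which was still being finalized at the time.)

```c
// Illustrative sketch of block quantization (not the actual ggml Q4 code):
// split the weights into fixed-size blocks, store one fp32 scale per block,
// and pack each weight into a signed 4-bit value relative to that scale.
#include <math.h>
#include <stdint.h>

#define QBLOCK 32  // hypothetical block size

// Quantize `n` floats (n must be a multiple of QBLOCK) into 4-bit codes.
// `scales` receives one scale per block, `codes` one byte per two values.
static void quantize_q4(const float *x, int n, float *scales, uint8_t *codes) {
    for (int b = 0; b < n / QBLOCK; b++) {
        const float *xb = x + b * QBLOCK;

        // find the largest magnitude in the block
        float amax = 0.0f;
        for (int i = 0; i < QBLOCK; i++) {
            float a = fabsf(xb[i]);
            if (a > amax) amax = a;
        }

        // map [-amax, amax] onto the signed 4-bit range [-8, 7]
        float scale = amax / 7.0f;
        scales[b] = scale;
        float inv = scale != 0.0f ? 1.0f / scale : 0.0f;

        for (int i = 0; i < QBLOCK; i += 2) {
            int q0 = (int) roundf(xb[i]     * inv);
            int q1 = (int) roundf(xb[i + 1] * inv);
            if (q0 < -8) q0 = -8; if (q0 > 7) q0 = 7;
            if (q1 < -8) q1 = -8; if (q1 > 7) q1 = 7;
            // pack two 4-bit values per byte, offset by 8 to keep them unsigned
            codes[(b * QBLOCK + i) / 2] = (uint8_t) ((q0 + 8) | ((q1 + 8) << 4));
        }
    }
}
```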
-
In the long run llama is a dead end. It's on a toxic Facebook license and the larger variants are unlicensed (illegal).
Though the work done here is awesome.
I wonder how well Cerebras-13B (Apache-2 license!) competes against llama-13B (toxic license) when fine-tuned.
I have little doubt that we'll see more foundation models pop up in the near future, maybe a Cerebras 40B, or likely releases from Musk once his new company starts putting their new A100s together.
I'd guess 99% of all attention for ggml LLM development is on llama.cpp, while Cerebras just sits in the GPT-2 "example" directory despite being a huge model with an open Apache-2 license.
Wouldn't it make sense to migrate this project into a general "ggllm.cpp" repository that supports all GPT-like models?
The eval() code could be taken from a /plugins/ directory, where we'd have e.g. llama.cpp and llama.h, so adding a new model would just require adding a plugin (a rough sketch of what that interface could look like follows below).
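To make the proposal concrete, here is one hypothetical shape such a plugin interface could take. The names (ggllm_plugin, load/eval/free, the per-plugin files) are invented for illustration and are not part of any existing API in the repo:

```c
// Hypothetical plugin interface sketch for a general "ggllm.cpp" - the names
// and layout are made up here for illustration, not an existing API.
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Each model family (llama, cerebras/gpt-2, ...) would live under /plugins/
// and expose one instance of this struct.
struct ggllm_plugin {
    const char *name;                         // e.g. "llama", "cerebras"

    // load model weights and hyperparameters from a ggml file
    void *(*load)(const char *model_path);

    // run one evaluation step: feed `n_tokens` token ids, write logits
    bool (*eval)(void *model, const int32_t *tokens, size_t n_tokens,
                 float *logits, size_t n_vocab);

    // release everything allocated by load()
    void (*free)(void *model);
};

// A new model would then just be a new plugin registration, e.g.:
//
//   extern const struct ggllm_plugin llama_plugin;     // plugins/llama.cpp
//   extern const struct ggllm_plugin cerebras_plugin;  // plugins/cerebras.cpp
//
// while the common driver (tokenizer glue, sampling, CLI) stays shared.
```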