exllamav2 and IBM Granite models #460
Closed
RaRasputinRGLM started this conversation in General
Replies: 1 comment, 6 replies
-
I'm unsure what the issue is. I added support for Granite two weeks ago. It indeed doesn't have an |
-
I wanted to work with these models because the sizes are appealing for local LLMs, and they all use the StarCoder tokenizer. However, model.embed_tokens is not found in any of the safetensors files. I was thinking I could hack a solution together by extracting that tensor from a StarCoder checkpoint, but I suspect there is a faster route I'm missing due to inexperience.
I don't think this is really an exllamav2 issue, but it seems worth discussing. There is a similar issue with IBM Granite over at llama.cpp that gives some insight into how they structured their models: ggml-org/llama.cpp#7116
In short, what is the right route with exllamav2 when embed_tokens is missing but you know which tokenizer the model uses?
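Before copying tensors across checkpoints, it may be worth confirming what the Granite safetensors files actually contain: one common reason an embedding tensor appears "missing" is weight tying, where a single tensor (often named lm_head.weight or something model-specific) doubles as the embedding matrix. The sketch below lists tensor names from a .safetensors file using only the standard library, since the format is just an 8-byte length-prefixed JSON header followed by raw data. The write_stub_checkpoint helper is hypothetical, here only so the example is self-contained; the checkpoint names are illustrative, not taken from the actual Granite files.

```python
import json
import os
import struct
import tempfile

def list_tensor_names(path):
    """List tensor names stored in a .safetensors file.

    Layout: an 8-byte little-endian unsigned header length, then a JSON
    header mapping tensor names to dtype/shape/offsets, then raw data.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional reserved key, not a tensor.
    return [k for k in header if k != "__metadata__"]

def write_stub_checkpoint(path, names):
    # Hypothetical helper for illustration only: writes a header-only stub
    # with zero-length tensors so list_tensor_names() has something to read.
    header = {n: {"dtype": "F32", "shape": [0], "data_offsets": [0, 0]}
              for n in names}
    blob = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)))
        f.write(blob)

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "stub.safetensors")
    # Simulate a checkpoint that ships lm_head.weight but no embed_tokens:
    write_stub_checkpoint(path, ["lm_head.weight", "model.norm.weight"])
    names = list_tensor_names(path)
    print("model.embed_tokens.weight" in names)  # False
    print("lm_head.weight" in names)             # True
```

If the real checkpoint turns out to carry the embedding weights under a different name (or tied to the output head), remapping that name in the loader is likely a cleaner fix than splicing a tensor out of StarCoder's files.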