Move grammar support out of examples? Unify? #1930

Open
josharian opened this issue Mar 5, 2024 · 3 comments

@josharian
Contributor

josharian commented Mar 5, 2024

I'd like to add Go binding support for grammars (for #1697). It's a bit inconvenient now, because the grammar support is off in an examples directory, and adding stuff from examples to libwhisper.a feels wrong; ditto for the header file.

GBNF seems pretty well established at this point (it is in common use in llama.cpp). It'd be nice to make it easier to support.

Could we promote grammar support to core whisper.h/whisper.cpp, similar to llama.cpp, where you simply provide a grammar string?

@ggerganov
Member

Yes, we should do that. I'm also thinking about moving all the grammar stuff into the ggml core library so that it becomes available everywhere. But the main problem is re-implementing the C++ stuff in C.

@josharian
Contributor Author

josharian commented Mar 7, 2024

> the main problem is re-implementing the C++ stuff in C

Another option is to hide it all behind a very simple C facade, something like new/init, parse, free. (At least, that would suffice for Go bindings.)
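A minimal sketch of what such a new/init, parse, free facade might look like from the C side. Every name here (`example_grammar`, `example_grammar_init`, `example_grammar_parse`, `example_grammar_free`) is hypothetical and does not exist in whisper.h. The body is only a stand-in so the sketch compiles on its own: it checks for a `::=` rule separator instead of doing real GBNF parsing. In practice the implementation would presumably stay in C++ and be exposed through `extern "C"`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical facade: none of these names exist in whisper.h. */

/* Opaque handle: callers never see the layout, so the real
 * implementation could keep its C++ parser state behind it. */
typedef struct example_grammar example_grammar;

example_grammar *example_grammar_init(void);
int example_grammar_parse(example_grammar *g, const char *gbnf);
void example_grammar_free(example_grammar *g);

/* Stand-in implementation, only so this sketch is self-contained.
 * It does NOT parse GBNF; it stores the text and checks that at
 * least one "::=" rule separator is present. */
struct example_grammar {
    char *source; /* copy of the last accepted grammar text */
};

example_grammar *example_grammar_init(void) {
    /* calloc zero-initializes, so source starts out NULL */
    return calloc(1, sizeof(example_grammar));
}

/* Returns 0 on success, -1 on bad arguments or rejected input. */
int example_grammar_parse(example_grammar *g, const char *gbnf) {
    if (g == NULL || gbnf == NULL) {
        return -1;
    }
    if (strstr(gbnf, "::=") == NULL) {
        return -1; /* placeholder check standing in for the real parser */
    }
    char *copy = malloc(strlen(gbnf) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, gbnf);
    free(g->source); /* drop any previously stored grammar */
    g->source = copy;
    return 0;
}

void example_grammar_free(example_grammar *g) {
    if (g == NULL) {
        return;
    }
    free(g->source);
    free(g);
}
```

With only an opaque pointer and three plain C functions, a binding layer such as cgo would need no knowledge of the C++ types behind the handle.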

Btw, I've noticed some crashes in the GBNF parser. I am planning to set up some fuzzing for it soon to try to shake them out.

@philipag

I have a basic implementation at #2838.
If someone can add this to the build, it should work with minor boilerplate changes.
