documentation/example for generate/eval/sample? #1111
Unanswered
jacekpoplawski asked this question in Q&A
Replies: 1 comment
-
Hey @jacekpoplawski, to answer your question in a few parts:

First, the reason you're getting that decode error is that generate doesn't automatically stop when it reaches n_ctx. This is probably a bug, really, and should be handled, but you can safely check the number of generated tokens against n_ctx yourself.

Second, when to stop is really up to you: the model will usually output an EOS (end-of-sequence) token that you can check for through llama.token_eos().

Let me know if that answers your question.
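A minimal sketch of both checks, assuming the llama-cpp-python API (the model path is a placeholder):

from llama_cpp import Llama

llama = Llama(model_path="model.gguf")  # placeholder path
tokens = llama.tokenize(b"Hello, world!")
n_generated = 0
for token in llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0, repeat_penalty=1.1):
    # stop on the end-of-sequence token
    if token == llama.token_eos():
        break
    print(llama.detokenize([token]))
    # guard against running past the context window, since generate won't stop on its own
    n_generated += 1
    if len(tokens) + n_generated >= llama.n_ctx():
        break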
-
Is there any example using the functions generate(), eval(), and sample()?
I tried to use the following code from the docs:
tokens = llama.tokenize(b"Hello, world!")
for token in llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0, repeat_penalty=1.1):
    print(llama.detokenize([token]))
Unfortunately it produces a lot of output and I am not sure when to stop.
How can I set max tokens, like in the usual call?
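(For illustration, one way to bound that loop manually; this is a sketch, and max_tokens here is an arbitrary cap, not a parameter of generate():)

max_tokens = 64  # arbitrary illustrative cap
for i, token in enumerate(llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0, repeat_penalty=1.1)):
    if i >= max_tokens:
        break
    print(llama.detokenize([token]))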
In my more complex code I usually see "RuntimeError: llama_decode returned 1" (with both generate and eval/sample).
What value should be passed to stopping_criteria? I tried:
stopping_criteria=["### Instruction:", "### Response:"]
but it didn't work.
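(For context, stopping_criteria in llama-cpp-python expects callables over the token-id and logits arrays that return True to stop, not plain strings. A sketch using one of the stop strings from above; the model path and prompt are placeholders:)

import numpy as np
from llama_cpp import Llama, StoppingCriteriaList

llama = Llama(model_path="model.gguf")  # placeholder path
tokens = llama.tokenize(b"Hello, world!")
stop_ids = llama.tokenize(b"### Instruction:", add_bos=False)

def stop_on_instruction(input_ids: np.ndarray, logits: np.ndarray) -> bool:
    # stop once the tail of the accumulated token ids matches the stop sequence
    return list(input_ids[-len(stop_ids):]) == stop_ids

for token in llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0,
                            repeat_penalty=1.1,
                            stopping_criteria=StoppingCriteriaList([stop_on_instruction])):
    if token == llama.token_eos():
        break
    print(llama.detokenize([token]))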