documentation/example for generate/eval/sample? #1111
Unanswered
jacekpoplawski asked this question in Q&A
Replies: 1 comment
-
Hey @jacekpoplawski, to answer your question in a few parts:

First, the reason you're getting that decode error is that generate doesn't automatically stop when it reaches n_ctx. This is probably a bug, really, and should be handled, but you can safely check the number of generated tokens against n_ctx yourself.

Second, when to stop is really up to you: the model will usually output an EOS (end-of-sequence) token that you can check for through llama.token_eos().

Let me know if that answers your question.
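A minimal sketch of both checks, assuming the llama-cpp-python API (the model path is a placeholder):

from llama_cpp import Llama

llama = Llama(model_path="model.gguf")  # placeholder path
tokens = llama.tokenize(b"Hello, world!")
n_generated = 0
for token in llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0, repeat_penalty=1.1):
    # stop on the end-of-sequence token
    if token == llama.token_eos():
        break
    print(llama.detokenize([token]))
    # guard against running past the context window, since generate won't stop on its own
    n_generated += 1
    if len(tokens) + n_generated >= llama.n_ctx():
        break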
-
Is there any example using the functions generate(), eval(), and sample()?
I tried to use the following code from the docs:
tokens = llama.tokenize(b"Hello, world!")
for token in llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0, repeat_penalty=1.1):
    print(llama.detokenize([token]))
Unfortunately it produces a lot of output and I am not sure when to stop.
How can I set max tokens, like in the usual call?
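(For illustration, one way to bound that loop manually; this is a sketch, and max_tokens here is an arbitrary cap, not a parameter of generate():)

max_tokens = 64  # arbitrary illustrative cap
for i, token in enumerate(llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0, repeat_penalty=1.1)):
    if i >= max_tokens:
        break
    print(llama.detokenize([token]))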
In my more complex code I usually see "RuntimeError: llama_decode returned 1" (with both generate and eval/sample).
What value should be passed to stopping_criteria? I tried:
stopping_criteria=["### Instruction:", "### Response:"]
but it didn't work.
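(For context, stopping_criteria in llama-cpp-python expects callables over the token-id and logits arrays that return True to stop, not plain strings. A sketch using one of the stop strings from above; the model path and prompt are placeholders:)

import numpy as np
from llama_cpp import Llama, StoppingCriteriaList

llama = Llama(model_path="model.gguf")  # placeholder path
tokens = llama.tokenize(b"Hello, world!")
stop_ids = llama.tokenize(b"### Instruction:", add_bos=False)

def stop_on_instruction(input_ids: np.ndarray, logits: np.ndarray) -> bool:
    # stop once the tail of the accumulated token ids matches the stop sequence
    return list(input_ids[-len(stop_ids):]) == stop_ids

for token in llama.generate(tokens, top_k=40, top_p=0.95, temp=1.0,
                            repeat_penalty=1.1,
                            stopping_criteria=StoppingCriteriaList([stop_on_instruction])):
    if token == llama.token_eos():
        break
    print(llama.detokenize([token]))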