-
Hi, I'm planning to employ in-context learning in my project and have chosen to use greedy decoding. Unlike in HuggingFace, it seems there is no single option to enable it, so I've set the request parameters as follows:

```python
param = {
    "n_predict": 256,
    "stop": ["\n\n"],
    "prompt": prompt,
    "temperature": 0.0,
    "top_k": 0,
    "top_p": 0.0,
    "repeat_last_n": 0,
    "repeat_penalty": 1.0,
    "penalize_nl": False,
    "tfs_z": 1.0,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "mirostat": 0
}
```

I've chosen these parameters based on the server documentation, since I'm using the server as an LLM backend. Can anyone provide feedback on this? Specifically, I'm wondering if I've missed something or if there are better values for certain parameters given my intended use case. Thank you in advance!
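For reference, here is a minimal sketch of how such a request could be sent (assuming the server is running locally on the default port 8080 and exposing the `/completion` endpoint; the prompt below is just a placeholder):

```python
# Minimal sketch: POST the greedy-decoding parameters to a local
# llama.cpp server. Assumes default host/port and the /completion endpoint.
import requests

param = {
    "n_predict": 256,
    "stop": ["\n\n"],
    "prompt": "Q: What is the capital of France?\nA:",  # placeholder prompt
    "temperature": 0.0,
    "top_k": 0,
    "top_p": 0.0,
    "repeat_last_n": 0,
    "repeat_penalty": 1.0,
    "penalize_nl": False,
    "tfs_z": 1.0,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "mirostat": 0,
}

resp = requests.post("http://localhost:8080/completion", json=param)
resp.raise_for_status()
print(resp.json()["content"])  # the generated completion
```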
-
At least in the
-
Hi, may I ask a follow-up question on what you've discussed? I use
as my request, but every time the response turns out different, which might suggest that the server was not doing greedy decoding. What did I do wrong? Thank you~
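A simple way to verify this is to send the identical request twice and compare the outputs; a minimal sketch (assuming a local server on port 8080, with placeholder values):

```python
# Sanity check: with greedy decoding, two identical requests should
# return identical completions. Assumes a local llama.cpp server.
import requests

URL = "http://localhost:8080/completion"  # adjust to your setup
body = {
    "prompt": "Q: What is the capital of France?\nA:",  # placeholder
    "n_predict": 32,
    "temperature": 0.0,
}

a = requests.post(URL, json=body).json()["content"]
b = requests.post(URL, json=body).json()["content"]
print("identical" if a == b else "different")
```

Note that even with greedy sampling, concurrent requests and continuous batching can introduce small numerical differences, so testing against an otherwise idle server gives the cleanest signal.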
-
This means that if I want to use greedy decoding, I just need to set `temp=0`.
-
Hey 😊@PenutChen! Your setup looks solid for fully greedy decoding: with temperature=0.0 the server picks the most likely token at every step, so the output is deterministic. One correction, though: repeat_last_n=0 disables the repetition-penalty window entirely, so repeat_penalty has no effect regardless of its value. If you ever run into repetitive loops, you could set repeat_last_n above zero and try repeat_penalty=1.1, or add a small presence_penalty or frequency_penalty, but keep in mind that any such penalty means the decoding is no longer purely greedy with respect to the raw logits. Overall, this should work well for in-context learning. 😊
-
Setting `temp = 0` will no longer be equivalent to greedy decoding (see #9897). To enable it, configure a single `top_k` sampler and set `k = 1`. For example, with `llama-cli` this can be done with the following CLI args:
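A hedged sketch of such an invocation (assuming `llama-cli`'s `--sampling-seq` and `--top-k` flags; the model path and prompt are placeholders):

```sh
# Restrict the sampler chain to a single top-k sampler with k = 1,
# which always selects the highest-probability token (greedy decoding).
llama-cli -m model.gguf --sampling-seq k --top-k 1 -p "Your prompt here"
```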