Feature request: Option to disable auto adding BOS token (double BOS token) if it's already present/added. #917
Comments
How are you getting this? KoboldCpp automatically adds a BOS token at the start of the prompt; you don't have to add your own.
The model was converted to GGUF using the original configs from its own repo: https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2 https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2/blob/main/tokenizer_config.json#L2052
This does not help at all.
@LostRuins – I should clarify for transparency/investigation: I'm using the latest release of KoboldCpp (KCPP) as my inference backend on Windows 11 with CuBLAS, connected to SillyTavern, where I interact with the model/character. If you believe this should be handled upstream, please let me know; honestly, I'm not sure myself.
Hmm, okay, so the issue is: let's say I manually edit the prompt to remove the first BOS token if the user adds another one. What if they add 2 BOS tokens instead? Or what if they actually want to have 2, 3, or more BOS tokens? Changing the BOS behavior based on what they send in the prompt seems kind of finicky - either the backend should add a BOS automatically or it shouldn't at all - then the frontend can expect consistent behavior. Fortunately, this doesn't actually seem to be an issue - having a double BOS in the prompt does not seem to negatively impact output quality at all; the first one is just ignored.
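For illustration, here is a minimal Python sketch of how a prompt ends up with two BOS tokens when the frontend's template and the backend each add one. This is not KoboldCpp's actual code; the token IDs and function names are made up for the example.

```python
# Illustrative only - not KoboldCpp's implementation.
BOS_ID = 128000  # Llama 3 <|begin_of_text|> id, used here purely as an example

def frontend_prompt_ids(user_ids: list[int]) -> list[int]:
    # A frontend/chat template that already emits its own BOS.
    return [BOS_ID] + user_ids

def backend_tokenize(prompt_ids: list[int], add_bos: bool = True) -> list[int]:
    # The backend prepends BOS whenever add_bos is enabled; a hypothetical
    # option to disable auto-BOS would amount to passing add_bos=False here.
    return ([BOS_ID] if add_bos else []) + prompt_ids

ids = backend_tokenize(frontend_prompt_ids([15339, 1917]))
print(ids[:2])  # [128000, 128000] -> the "double BOS" case the warning reports
```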
This would be optional, of course. But I also agree that the outputs should be consistent for the sake of the frontends.
I was wondering about that, since I didn't notice any issues other than the new warning once it was added upstream, but it makes people think something is wrong. From the user's side, it looks like either the model or the backend is doing something incorrectly. And considering that the warning still persists even after manually changing the model's chat_template, I'm not sure.
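As a sanity check outside of KoboldCpp, one way to see whether the extra BOS comes from the model's own chat template is to render it with Hugging Face transformers and check whether the rendered text already begins with the BOS token. This is only a diagnostic sketch against the linked Stheno repo; what it prints depends on the template actually shipped in its tokenizer config.

```python
# Diagnostic sketch: does the model's chat template already emit the BOS text?
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Sao10K/L3-8B-Stheno-v3.2")
rendered = tok.apply_chat_template(
    [{"role": "user", "content": "Hi"}],
    tokenize=False,
    add_generation_prompt=True,
)
# If this prints True, the template itself supplies a BOS, so a backend that
# also prepends one produces the double BOS the warning is about.
print(rendered.startswith(tok.bos_token))
```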
How can I disable this automatic behavior? And if it's not possible yet, can we get a --flag for it?
```
llama_tokenize_internal: Added a BOS token to the prompt as specified by the model but the prompt also starts with a BOS token.
```
Running into this with Llama-3-8B models.
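For context, the warning appears to be informational: generation still proceeds, and it just flags that a BOS was auto-added to a prompt that already started with one. A rough Python rendering of that condition (names here are illustrative; the real check lives in llama.cpp's tokenizer) might look like:

```python
# Rough rendering of the condition behind the warning; illustrative names only.
def tokenize(prompt_ids: list[int], bos_id: int, add_bos: bool) -> list[int]:
    out = ([bos_id] if add_bos else []) + prompt_ids
    if add_bos and len(out) > 1 and out[0] == bos_id == out[1]:
        print("warning: added a BOS token, but the prompt already starts with one")
    return out
```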
Related PR:
ggml-org#7332