minor fix in mixtral_moe.py for apple silicon #170
Conversation
There are two fixes that address errors encountered when run on an Apple M2 MacBook with Rosetta enabled.
Thanks for the PR - I appreciate it! Put some minor comments up.
mergekit/scripts/mixtral_moe.py (Outdated)

@@ -113,7 +113,7 @@ def tokenize_prompts(
     prompts: List[str], tokenizer: transformers.PreTrainedTokenizerBase
 ):
     return tokenizer(
-        [tokenizer.bos_token + p for p in prompts],
+        [str(tokenizer.bos_token) + p for p in prompts],
It probably would be better to do tokenizer.bos_token or "" here - otherwise it'll prepend the string "None" to your prompts if the BOS token isn't set.
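The difference between the two approaches can be shown with a minimal sketch. FakeTokenizer here is a hypothetical stand-in for a tokenizer whose BOS token is unset; it is not part of mergekit or transformers:

```python
class FakeTokenizer:
    """Hypothetical stand-in for a tokenizer with no BOS token configured."""
    bos_token = None

tok = FakeTokenizer()
prompt = "hello"

# str() stringifies None, silently corrupting the prompt:
with_str = str(tok.bos_token) + prompt      # "Nonehello"

# `or ""` falls back to an empty string instead:
with_or = (tok.bos_token or "") + prompt    # "hello"

print(with_str, with_or)
```

So the str() cast avoids the TypeError but would inject the literal text "None" into every prompt on an affected tokenizer, which is why the fallback was suggested.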
ok done
mergekit/scripts/mixtral_moe.py (Outdated)

@@ -366,6 +366,7 @@ def build(
     )
     tokenizer.padding_side = "left"
     tokenizer.pad_token_id = tokenizer.bos_token_id
+    tokenizer.pad_token = tokenizer.eos_token
It'd be great to make this conditional - if the pad token is already set it'd be good to not overwrite it.
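The conditional version can be sketched as follows. FakeTokenizer is again a hypothetical stand-in; the real code operates on a transformers tokenizer object:

```python
class FakeTokenizer:
    """Hypothetical stand-in for a transformers tokenizer."""
    def __init__(self, pad_token=None, eos_token="</s>"):
        self.pad_token = pad_token
        self.eos_token = eos_token

def ensure_pad_token(tokenizer):
    # Fall back to the EOS token only when no pad token is configured,
    # so an explicitly set pad token is never overwritten.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer

print(ensure_pad_token(FakeTokenizer()).pad_token)                   # </s>
print(ensure_pad_token(FakeTokenizer(pad_token="<pad>")).pad_token)  # <pad>
```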
done
address review comments
There are two fixes that address errors encountered when run on an Apple M2 MacBook with Rosetta enabled. I'm not sure whether this is a universal problem or specific to my environment, so I didn't file an issue.

Fix 1: cast tokenizer.bos_token to str. This works around a persistent error,
"TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'",
even though bos_token is supposed to be a str.

Fix 2: set a pad_token value, per the suggestion in the padding_strategy check, which raised
ValueError: "Asking to pad but the tokenizer does not have a padding token."
(line 2708 in transformers/tokenization_utils_base.py)
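The first error can be reproduced without transformers at all; a minimal sketch, where bos_token stands in for a tokenizer attribute that resolves to None:

```python
bos_token = None  # stand-in for tokenizer.bos_token being unset
prompt = "example"

try:
    text = bos_token + prompt
except TypeError as exc:
    # Matches the error seen on the affected setup:
    # unsupported operand type(s) for +: 'NoneType' and 'str'
    print(exc)
```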