Pull requests: pytorch-labs/tokenizers

Fix cmake for regex lookahead [CLA Signed]
#80 by jackzhxng was merged Jun 5, 2025
Reland #66 and #67 [CLA Signed]
#79 by jackzhxng was closed Jun 3, 2025
Add regex unit tests and enable shared linkage in fbcode [CLA Signed] [fb-exported]
#78 by larryliu0820 was merged May 29, 2025
Use weak symbol create_fallback_regex to separate the implementation using PCRE2 and std::regex [CLA Signed] [fb-exported]
#77 by larryliu0820 was merged May 22, 2025
Add look ahead tiktoken target [CLA Signed] [fb-exported]
#75 by larryliu0820 was merged May 16, 2025
Reland #66 and #67 [CLA Signed] [fb-exported]
#74 by jackzhxng was merged Jun 2, 2025
Revert "Handle null bos and eos token" [CLA Signed]
#73 by jackzhxng was merged May 13, 2025
Revert "Fix tokenizer special token handling" [CLA Signed]
#72 by jackzhxng was merged May 13, 2025
Tokenizer clang format [CLA Signed] [fb-exported]
#71 by jackzhxng was closed Jun 3, 2025
Tokenizer clang format [CLA Signed] [fb-exported]
#70 by jackzhxng was closed Jun 3, 2025
Accept custom pattern string and special tokens [CLA Signed] [fb-exported]
#69 by sxu was merged May 8, 2025
Fix tokenizer special token handling [CLA Signed]
#67 by jackzhxng was merged May 2, 2025
Handle null bos and eos token [CLA Signed]
#66 by jackzhxng was merged May 1, 2025
Fix tokenizer special token handling [CLA Signed]
#65 by jackzhxng was merged May 1, 2025
Forward fix missing vtable for bpe tokenizer [CLA Signed] [fb-exported]
#64 by jackzhxng was merged Apr 30, 2025
Fix duplicate bpe tokenizer base symbol [CLA Signed]
#63 by jackzhxng was merged Apr 29, 2025
Add cmakelist for llama unicode [CLA Signed]
#62 by jackzhxng was merged Apr 29, 2025
Log hf tokenizer load failure to cerr instead of cout [CLA Signed]
#61 by jackzhxng was merged Apr 25, 2025
Fix pcre2 target [CLA Signed]
#60 by jackzhxng was merged Apr 25, 2025
Gate regex lookahead in cmake behind compile flag [CLA Signed] [fb-exported]
#59 by jackzhxng was merged Apr 24, 2025
Pcre2 buck target in third-party (#55) [CLA Signed]
#58 by jackzhxng was merged Apr 23, 2025
Option to optimize tokenizer without pcre2 [CLA Signed] [fb-exported]
#56 by jackzhxng was closed Apr 24, 2025
Pcre2 buck target in third-party [CLA Signed] [fb-exported]
#55 by jackzhxng was closed Apr 23, 2025
Initialize bos_tok_ = 0 in tokenizer.h [CLA Signed]
#54 by kirklandsign was merged Apr 21, 2025