[Misc] Clean up input processing #17582
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small subset of checks runs at first. Once the PR is approved and ready to go, your PR reviewer(s) can run full CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge. 🚀
```python
def _validate_token_prompt(self, prompt: PromptType,
                           tokenizer: AnyTokenizer):
    # Guard against out-of-vocab tokens.
    # For some tokenizers, tokenizer.decode will happily return empty text
    # for token ids that are out of vocab, and we don't detect token ids
    # that are greater than the max token id before running the model.
    # However, these token ids will later crash a CUDA kernel at runtime
    # with an index out of bounds error. This will crash the entire engine.
    # This needs to happen before multimodal input pre-processing, which
    # may add dummy <image> tokens that aren't part of the tokenizer's
    # vocabulary.
    if is_token_prompt(prompt):
        prompt_ids = prompt["prompt_token_ids"]
        if len(prompt_ids) == 0:
            # Empty prompt check is handled later
            return
        max_input_id = max(prompt_ids)
        if max_input_id > tokenizer.max_token_id:
            raise ValueError(
                f"Token id {max_input_id} is out of vocabulary")
```
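For illustration, here is a minimal, self-contained sketch of the same guard outside the engine; `MAX_TOKEN_ID` and `validate_prompt_ids` are hypothetical stand-ins for `tokenizer.max_token_id` and the method above:

```python
# Hypothetical stand-in for tokenizer.max_token_id (not a real vLLM constant).
MAX_TOKEN_ID = 32_000

def validate_prompt_ids(prompt_ids: list[int]) -> None:
    """Sketch of the out-of-vocab guard shown in the diff above."""
    if len(prompt_ids) == 0:
        # An empty prompt is rejected by a separate check later on.
        return
    max_input_id = max(prompt_ids)
    if max_input_id > MAX_TOKEN_ID:
        # Without this guard, the bad id would only surface as an
        # index-out-of-bounds error inside a CUDA kernel at runtime.
        raise ValueError(f"Token id {max_input_id} is out of vocabulary")

validate_prompt_ids([1, 5, 9])          # passes silently
# validate_prompt_ids([99_999_999])     # would raise ValueError
```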
#11980 lets us move this to after the processor is applied, making V0 validation (`_validate_model_inputs`) the same as V1's.
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```python
cache_salt: NotRequired[str]
"""
Optional cache salt to be used for prefix caching.
"""
```
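As a usage sketch (the surrounding TypedDict is reconstructed here for illustration and is not the exact class from the diff; only `cache_salt` comes from the snippet above):

```python
from typing import TypedDict
from typing_extensions import NotRequired  # in stdlib typing on Python 3.11+

class TokensPrompt(TypedDict):
    """Illustrative prompt type; only cache_salt is taken from the diff."""
    prompt_token_ids: list[int]
    cache_salt: NotRequired[str]

# Two requests with identical token ids but different salts should not
# share prefix-cache entries.
req_a = TokensPrompt(prompt_token_ids=[1, 2, 3], cache_salt="tenant-a")
req_b = TokensPrompt(prompt_token_ids=[1, 2, 3], cache_salt="tenant-b")
```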
Although we don't support embedding inputs in V1 yet, I added this so we don't forget to assign `cache_salt` during processing when we do implement it.
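A rough sketch of what "assigning `cache_salt` during processing" could look like; `process_inputs` and the dict shapes here are hypothetical, not the actual vLLM processor:

```python
def process_inputs(prompt: dict) -> dict:
    # Hypothetical processing step: build the engine-side inputs and
    # carry the salt through so prefix caching can account for it.
    processed = {"prompt_token_ids": prompt["prompt_token_ids"]}
    if "cache_salt" in prompt:
        processed["cache_salt"] = prompt["cache_salt"]
    return processed

print(process_inputs({"prompt_token_ids": [1, 2, 3], "cache_salt": "tenant-a"}))
```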
Overall looks reasonable to me!
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Following #17045 and #15428, the preprocessing logic has become rather complicated. This PR cleans it up by narrowing the prompt types early and handling them separately, making it easier to keep track of how each prompt type is processed.
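As a rough sketch of the "narrow early, then handle separately" structure described above (the names and shapes are illustrative, not the actual vLLM API):

```python
from typing import TypedDict, Union

class TextPrompt(TypedDict):
    prompt: str

class TokensPrompt(TypedDict):
    prompt_token_ids: list[int]

# A prompt may arrive as a bare string or as one of two dict shapes.
PromptType = Union[str, TextPrompt, TokensPrompt]

def parse_prompt(prompt: PromptType) -> Union[TextPrompt, TokensPrompt]:
    # Narrow the union once, up front, so each downstream path only
    # ever sees one concrete prompt type.
    if isinstance(prompt, str):
        return TextPrompt(prompt=prompt)
    if "prompt_token_ids" in prompt:
        return TokensPrompt(prompt_token_ids=prompt["prompt_token_ids"])
    return TextPrompt(prompt=prompt["prompt"])

def tokenize(text: str) -> list[int]:
    return [ord(c) for c in text]  # stand-in for a real tokenizer

def process(prompt: PromptType) -> list[int]:
    parsed = parse_prompt(prompt)
    if "prompt_token_ids" in parsed:
        return parsed["prompt_token_ids"]  # already tokenized
    return tokenize(parsed["prompt"])      # text path

assert process("hi") == process({"prompt": "hi"})
```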