[Misc] Update llama 3.2 template to support system prompt with images #10901

Merged

Conversation

tjohnson31415 (Contributor) commented Dec 4, 2024

Update the example chat template to support using a system prompt with images:

  • remove the exception raised if sending a system message and images
  • when prompting with an image, render the system message if the user supplied a system prompt or if tools are requested

The corresponding change was recently made to the chat template on the HF Hub:
https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct/discussions/84
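
For illustration, a minimal sketch of a request that combines a system prompt with an image via vLLM's OpenAI-compatible server (the server URL, model name, and image URL are placeholders, not part of this change). With the old template this combination raised an exception; with the updated template it renders a system block followed by the user turn:

from openai import OpenAI

# Placeholder endpoint and model; adjust to your deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/duck.jpg"}},
            {"type": "text", "text": "What's in this image?"},
        ]},
    ],
    max_tokens=64,
)
print(response.choices[0].message.content)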

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

github-actions bot commented Dec 4, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add the ready label to the PR
  • Enable auto-merge.

🚀

pcuenca commented Dec 4, 2024

Hi @tjohnson31415! I think the suggested template is not identical to the version we pushed to the Hub. Unless I'm mistaken, if the user provides images but no system role, the input prompt should not have a system prompt. This template, if I'm reading the diff correctly, would default to a system prompt in that situation.

For example, given this input:

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "}
    ]}
]

The desired prompt would be:

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

<|image|>If I had to write a haiku for this one, it would be: <|eot_id|>

(I'm with HF and not a reviewer of this project. But I think it's important to ensure consistency in prompts across implementations, happy to revisit ours if it's in error).
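
For a quick comparison, a minimal sketch that renders the Hub template offline (assuming access to the gated meta-llama/Llama-3.2-11B-Vision-Instruct repo and a transformers release whose processor exposes apply_chat_template; no image data is needed just to render the prompt string):

from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "},
    ]}
]

# tokenize=False returns the rendered prompt string; with no system role and
# an image present, it should contain no system block.
print(processor.apply_chat_template(messages, tokenize=False))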

Isotr0py (Collaborator) left a comment

I tested this template with examples/openai_chat_completion_client_for_multimodal.py and the prompt format looks good (there is no system prompt):

INFO:     127.0.0.1:42408 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 12-05 03:24:17 logger.py:37] Received request chatcmpl-384a0b3b15b84d7b8a9f8dc83827a827: prompt: "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nWhat's in this image?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n", params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.7, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=64, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: None, lora_request: None, prompt_adapter_request: None.

If I add the system prompt "You are a helpful assistant." to the messages, the request is:

INFO 12-05 03:26:58 logger.py:37] Received request chatcmpl-a3c07b7328674ab183377af2f8ce4999: prompt: "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 05 Dec 2024\n\nYou are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWhat's in this image?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n", params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.7, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=64, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: None, lora_request: None, prompt_adapter_request: None.
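
For reference, the second request above corresponds to a messages list along these lines (an illustrative reconstruction based on the example script, not the exact payload; the image URL is a placeholder):

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
    ]},
]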

Isotr0py added the ready label (ONLY add when PR is ready to merge / full CI is needed) on Dec 5, 2024
Isotr0py enabled auto-merge (squash) on December 5, 2024 03:46
Isotr0py merged commit 39c89e7 into vllm-project:main on Dec 5, 2024
45 checks passed
tjohnson31415 deleted the llama-vision-template-update branch on December 5, 2024 07:22
tjohnson31415 (Contributor, Author) commented Dec 5, 2024

Thanks @Isotr0py and @pcuenca!

@pcuenca I agree that consistency across chat templates would be great! I have found it difficult to pin down a "ground truth" for chat templates, particularly once tool calling is involved.

There are a couple of things that make the vLLM template different from the HF Hub one:

  1. Handling content that is a string vs. an OpenAI-style list of objects. vLLM will transform the input messages (for all roles) depending on the --chat-template-content-format configuration, or will auto-detect the format from the chat template. I decided to just support both formats 😅

  2. There is a default system prompt message for tool use that was developed in the PR that first added these example templates. This default prompt is why I did not include user_supplied_system_message from the HF Hub template and wrote the check as

{%- if system_message or not image_ns.has_images %}

The system message will be non-empty if either the user supplies a system message or tools are included with the request.
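
Put differently, the check reduces to the following decision; this is a minimal Python sketch of the template logic (the actual template is Jinja, and the default tool-use prompt text is elided here):

def renders_system_block(user_system_message: str, tools_requested: bool, has_images: bool) -> bool:
    # system_message mirrors the template variable: the user-supplied system
    # message if there is one, otherwise the default tool-use prompt when
    # tools are requested, otherwise the empty string.
    default_tool_prompt = "..."  # placeholder for the template's built-in tool-use prompt
    system_message = user_system_message or (default_tool_prompt if tools_requested else "")
    # equivalent to: {%- if system_message or not image_ns.has_images %}
    return bool(system_message) or not has_images

# Image-only chat without a system message or tools: no system block.
assert renders_system_block("", tools_requested=False, has_images=True) is False
# A user-supplied system message or a tool request brings the block back.
assert renders_system_block("You are a helpful assistant.", tools_requested=False, has_images=True) is True
assert renders_system_block("", tools_requested=True, has_images=True) is True
# Text-only requests always pass the check.
assert renders_system_block("", tools_requested=False, has_images=False) is True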

pcuenca commented Dec 5, 2024

Hi @tjohnson31415, thanks for taking the time to explain! I was misled by the empty string in line 41, thinking that it would lead to a false evaluation in the if below. Perhaps we can also simplify our own template.

sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024
CatherineSue pushed a commit to moirai-internal/vllm that referenced this pull request May 5, 2025
…port system prompt with images (vllm-project#10901)

Merge in GEN/vllm from tool-use-with-image-support-fix to feature-based-on-v0.6.4.post1.c78ab524

Squashed commit of the following:

commit c78ab524b67cf0bd0ffd7dc11804c1d70682be93
Author: Travis Johnson <tsjohnso@us.ibm.com>
Date:   Wed Dec 4 22:54:06 2024 -0700

    [Misc] Update llama 3.2 template to support system prompt with images (vllm-project#10901)

    Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>