[Misc] Update llama 3.2 template to support system prompt with images #10901
Conversation
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀
Hi @tjohnson31415! I think the suggested template is not identical to the version we pushed to the Hub. Unless I'm mistaken, if the user provides images but no system role, the input prompt should not have a system prompt. This template, if I'm reading the diff correctly, would default to a system prompt in that situation. For example, given this input:
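(A sketch of such an input, assuming the standard OpenAI-style chat payload; the text and image URL below are placeholders, not the exact snippet from the comment:)

```python
# Illustrative input: a single user turn with text and an image, no system role.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    }
]
```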
The desired prompt would be:
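Something along these lines, with no system header at all (illustrative; it matches the no-system-prompt format shown in the server logs later in this thread):

```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

What's in this image?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>

```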
(I'm with HF and not a reviewer of this project, but I think it's important to keep prompts consistent across implementations; happy to revisit ours if it's in error.)
I tested this template with examples/openai_chat_completion_client_for_multimodal.py, and the prompt format looks good (there is no system prompt):
INFO: 127.0.0.1:42408 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 12-05 03:24:17 logger.py:37] Received request chatcmpl-384a0b3b15b84d7b8a9f8dc83827a827: prompt: "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nWhat's in this image?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n", params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.7, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=64, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: None, lora_request: None, prompt_adapter_request: None.
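For reference, a minimal client along the lines of that example script might look like this (a sketch only: the base URL, API key, model name, and image URL are assumptions, not taken from the script):

```python
from openai import OpenAI

# Assumes a vLLM OpenAI-compatible server running locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
            ],
        }
    ],
    max_tokens=64,
)
print(response.choices[0].message.content)
```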
If I add the system prompt "You are a helpful assistant." to the messages, the request is:
INFO 12-05 03:26:58 logger.py:37] Received request chatcmpl-a3c07b7328674ab183377af2f8ce4999: prompt: "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 05 Dec 2024\n\nYou are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWhat's in this image?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n", params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.7, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=64, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: None, lora_request: None, prompt_adapter_request: None.
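That corresponds to prepending a system entry to the same messages list, e.g.:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    },
]
```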
Thanks @Isotr0py and @pcuenca! @pcuenca I agree that consistency with the chat templates would be great! I have found it difficult to pin down a "ground truth" for chat templates, particularly once tool calling is involved. There are a couple of things that make the vLLM template different from the HF Hub one:
- The system message will be non-empty if either the user supplies a system message or tools are included with the request.
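In rough Python terms, that gating logic works something like the following (a sketch of the template's behavior, not the actual Jinja source; format_tool_instructions is a hypothetical stand-in for the template's tool-use preamble):

```python
import json


def format_tool_instructions(tools: list) -> str:
    # Hypothetical stand-in for the template's tool-use instructions.
    return "You have access to the following tools:\n" + json.dumps(tools)


def render_system_block(system_message: str, tools: list) -> str:
    """Emit a system header only when it would be non-empty, i.e. when the
    user supplied a system message and/or the request includes tools."""
    if not system_message and not tools:
        return ""  # no system block appears in the rendered prompt
    parts = []
    if tools:
        parts.append(format_tool_instructions(tools))
    if system_message:
        parts.append(system_message)
    return (
        "<|start_header_id|>system<|end_header_id|>\n\n"
        + "\n\n".join(parts)
        + "<|eot_id|>"
    )
```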
Hi @tjohnson31415, thanks for taking the time to explain! I was misled by the empty string in line 41, thinking that it would lead to a false evaluation in the condition.
…vllm-project#10901) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
…port system prompt with images (vllm-project#10901)

Merge in GEN/vllm from tool-use-with-image-support-fix to feature-based-on-v0.6.4.post1.c78ab524

Squashed commit of the following:

commit c78ab524b67cf0bd0ffd7dc11804c1d70682be93
Author: Travis Johnson <tsjohnso@us.ibm.com>
Date: Wed Dec 4 22:54:06 2024 -0700

    [Misc] Update llama 3.2 template to support system prompt with images (vllm-project#10901)

    Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Update the example chat template to support using a system prompt with images:
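With the updated template, a user-supplied system message renders alongside image content, e.g. (format taken from the server log earlier in this thread):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 05 Dec 2024

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

What's in this image?<|image|><|eot_id|><|start_header_id|>assistant<|end_header_id|>

```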
The corresponding change was recently made to the chat template on the HF Hub:
https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct/discussions/84