[Misc] Consolidate Audio tests into multimodal common generation tests #18214

Isotr0py · 2025-05-15T16:30:02Z

Split from [Misc] Refactor VLM common generation tests to support audio inputs and mix-modality tests #17633, so that we can start consolidating the audio tests.

Signed-off-by: Isotr0py <2037008807@qq.com>

github-actions · 2025-05-15T16:30:13Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

Isotr0py · 2025-05-15T16:33:31Z

tests/models/multimodal/generation/test_common.py

@@ -158,6 +158,17 @@
        image_size_factors=[(), (0.25,), (0.25, 0.25, 0.25), (0.25, 0.2, 0.15)],
        marks=[pytest.mark.core_model, pytest.mark.cpu_model],
    ),
+    "ultravox": VLMTestInfo(
+        models = ["fixie-ai/ultravox-v0_5-llama-3_2-1b"],
+        test_type=VLMTestType.AUDIO,


Since ultravox's HF runner doesn't support multi-audio input, I will add a multi-audio test type, together with the missing Qwen2-Audio test, in a follow-up PR.
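
For readers unfamiliar with this registry-style layout, here is a minimal sketch of how a settings table keyed by model name can be filtered by test type to produce parametrized cases. It is not vLLM's actual implementation: `SketchTestType`, `SketchTestInfo`, `SKETCH_SETTINGS`, `cases_for`, and the `example-org/example-vlm` entry are hypothetical stand-ins; only the ultravox model name and the idea of an audio test type come from the diff above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SketchTestType(Enum):
    """Hypothetical stand-in for the test-type enum used in the diff."""
    IMAGE = auto()
    AUDIO = auto()


@dataclass(frozen=True)
class SketchTestInfo:
    """Hypothetical stand-in for a per-model test settings entry."""
    models: list[str]
    test_type: SketchTestType


# One entry per model family, as in the diff; the second entry is a placeholder.
SKETCH_SETTINGS = {
    "ultravox": SketchTestInfo(
        models=["fixie-ai/ultravox-v0_5-llama-3_2-1b"],
        test_type=SketchTestType.AUDIO,
    ),
    "example-image-model": SketchTestInfo(
        models=["example-org/example-vlm"],  # assumption: illustrative only
        test_type=SketchTestType.IMAGE,
    ),
}


def cases_for(test_type: SketchTestType) -> list[tuple[str, str]]:
    """Return (entry name, model id) pairs for every entry of the given type."""
    return [
        (name, model)
        for name, info in SKETCH_SETTINGS.items()
        if info.test_type is test_type
        for model in info.models
    ]
```

A list like the one returned by `cases_for(SketchTestType.AUDIO)` could then be passed to `pytest.mark.parametrize`, so a single shared test function covers every audio model in the table.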

DarkLight1337

Looks reasonable, should be good as long as tests pass. Thanks for working on this!

Isotr0py added 21 commits May 2, 2025 17:41

refactor vlm test runner to support audio input

12ba7b4

Signed-off-by: Isotr0py <2037008807@qq.com>

add mixed modality input

c0486f6

Signed-off-by: Isotr0py <2037008807@qq.com>

fix model loading

a1f9ef8

Signed-off-by: Isotr0py <2037008807@qq.com>

Merge remote-tracking branch 'upstream/main' into qwen25-omni-test

6e5aaf2

Signed-off-by: Isotr0py <2037008807@qq.com>

fix tests

f5b23c4

Signed-off-by: Isotr0py <2037008807@qq.com>

fix aspect ratio tests

f1a06a0

Signed-off-by: Isotr0py <2037008807@qq.com>

add helper class

c43e3ce

Signed-off-by: Isotr0py <2037008807@qq.com>

rename

ee1d37b

Signed-off-by: Isotr0py <2037008807@qq.com>

make mypy happy

a04edd0

Signed-off-by: Isotr0py <2037008807@qq.com>

expose video data

4d996c1

Signed-off-by: Isotr0py <2037008807@qq.com>

remove runner mm_key

00ca64d

Signed-off-by: Isotr0py <2037008807@qq.com>

add video to mix modality test

9a55fa5

Signed-off-by: Isotr0py <2037008807@qq.com>

Merge remote-tracking branch 'origin/qwen25-omni-test' into audio-test

cf27d8e

Signed-off-by: Isotr0py <2037008807@qq.com>

integrate audio tests

36903f6

Signed-off-by: Isotr0py <2037008807@qq.com>

add audio test

8a5981a

Signed-off-by: Isotr0py <2037008807@qq.com>

clean up ultravox test

63d23b9

Signed-off-by: Isotr0py <2037008807@qq.com>

align vllm and hf outputs

c42edb4

Signed-off-by: Isotr0py <2037008807@qq.com>

fix audio prompt building

b1b5251

Signed-off-by: Isotr0py <2037008807@qq.com>

code format

b01d02b

Signed-off-by: Isotr0py <2037008807@qq.com>

fix types

c042ccf

Signed-off-by: Isotr0py <2037008807@qq.com>

correct annotations

bccb7b4

Signed-off-by: Isotr0py <2037008807@qq.com>

Isotr0py requested review from DarkLight1337 and ywang96 as code owners May 15, 2025 16:30

mergify bot added the multi-modality Related to multi-modality (#4194) label May 15, 2025

Merge branch 'vllm-project:main' into audio-test

cfd48e8

DarkLight1337 approved these changes May 16, 2025

DarkLight1337 enabled auto-merge (squash) May 16, 2025 07:18

github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label May 16, 2025

DarkLight1337 merged commit 390ec88 into vllm-project:main May 16, 2025
57 checks passed

Isotr0py deleted the audio-test branch May 16, 2025 09:19
