[CI] Make JSON output tests less likely to fail #17859

russellb · 2025-05-08T13:20:06Z

We occasionally see the JSON format structured output tests fail in CI.
PR #17490 changed the prompts to ask for a response that is as short as
possible. This change adds a couple more things to help:

- Increase the output length limit. The failures occur when we cut off
  the output before a JSON object is properly terminated.
- Set `additionalProperties` to `False` in each JSON schema used. This
  should keep the model from adding properties that are not specified in
  the schemas. Extra properties unnecessarily increase the size of the
  JSON output and make it more likely to hit the length limit before the
  object is finished.
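As a rough illustration of the two points above, here is a minimal sketch using only the Python standard library. The helper names `forbid_additional_properties` and `is_complete_json` and the sample schema are hypothetical and are not taken from the vLLM test suite; the sketch only shows what "closing" a schema looks like and why a truncated object fails to parse.

```python
import copy
import json
from typing import Any


def forbid_additional_properties(schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `schema` with additionalProperties=False on object schemas.

    Hypothetical helper: an explicit additionalProperties value that is
    already present is left untouched.
    """
    result = copy.deepcopy(schema)

    def visit(node: Any) -> None:
        if isinstance(node, dict):
            if node.get("type") == "object":
                node.setdefault("additionalProperties", False)
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(result)
    return result


def is_complete_json(text: str) -> bool:
    """Return True if `text` parses as JSON, False if it is cut off or malformed."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Illustrative schema, not one of the schemas used by the tests.
sample_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        },
    },
}

closed_schema = forbid_additional_properties(sample_schema)
```

Output that stops at the length limit, such as `{"name": "Ada", "address": {"city": "Lon`, makes `is_complete_json` return `False`, which is the failure mode described above.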

Signed-off-by: Russell Bryant <rbryant@redhat.com>

github-actions · 2025-05-08T13:20:17Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

DarkLight1337

LGTM thanks

DarkLight1337 · 2025-05-08T16:14:44Z

Hmm the test is still failing

russellb · 2025-05-08T16:23:44Z

> Hmm the test is still failing

The part that failed is a spot I missed in my changes. I pushed an update. Hopefully this time it's green ...

DarkLight1337 · 2025-05-09T03:15:14Z

There seems to be yet another error

russellb · 2025-05-09T13:52:31Z

The error in the last failure is:

[2025-05-08T18:07:45Z] /usr/local/lib/python3.12/dist-packages/schemathesis/generation/coverage.py:305: DeprecationWarning: jsonschema.exceptions.RefResolutionError is deprecated as of version 4.18.0. If you wish to catch potential reference resolution errors, directly catch referencing.exceptions.Unresolvable.
| [2025-05-08T18:07:45Z] ref_error: type[Exception] = jsonschema.RefResolutionError,

This was the test:

[2025-05-08T18:07:45Z] FAILED v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config6] - vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.

I'll dig into it ...

russellb · 2025-05-09T15:24:44Z

I ran the failed test in a loop for about an hour and never hit the same failure. I did hit a different one, but it turned out to be a known issue in xgrammar: mlc-ai/xgrammar#286. xgrammar allows invalid characters inside a JSON string, so that is probably also causing occasional CI failures.

I think the current PR is probably a net improvement, though there is still work to do.
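To make the xgrammar issue concrete, here is a small sketch, assuming the invalid characters are raw control characters such as an unescaped newline inside a string (the thread does not say which characters are involved). It is not the workaround used in this PR. It relies only on the standard library's `json.loads(..., strict=False)`, which accepts control characters inside strings; the function name `parse_lenient` is hypothetical.

```python
import json
from typing import Any


def parse_lenient(text: str) -> Any:
    """Parse JSON, tolerating raw control characters inside string values.

    Hypothetical helper: tries a strict parse first and only falls back to
    strict=False when the strict parse fails.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # strict=False allows control characters (for example a raw newline)
        # inside JSON strings; other syntax errors still raise.
        return json.loads(text, strict=False)


# A raw newline inside the string value makes this invalid strict JSON.
raw_output = '{"msg": "line one\nline two"}'
parsed = parse_lenient(raw_output)
```

A strict `json.loads(raw_output)` raises `json.JSONDecodeError` for this input, which is how such output would show up as a test failure.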

Stranger6667 · 2025-05-10T13:28:06Z

@russellb

[2025-05-08T18:07:45Z] /usr/local/lib/python3.12/dist-packages/schemathesis/generation/coverage.py:305: DeprecationWarning: jsonschema.exceptions.RefResolutionError is deprecated as of version 4.18.0. If you wish to catch potential reference resolution errors, directly catch referencing.exceptions.Unresolvable.
| [2025-05-08T18:07:45Z] ref_error: type[Exception] = jsonschema.RefResolutionError,

FWIW, from the Schemathesis side, you can safely ignore that specific warning. The jsonschema version is pinned in Schemathesis, and the removal of RefResolutionError will only happen in jsonschema==5. I should probably add a filter for this warning to Schemathesis to avoid the irrelevant output.
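Until such a filter exists upstream, a test suite could silence the message locally. The following is a minimal sketch using the standard `warnings` module; the function name `ignore_ref_resolution_deprecation` is hypothetical, and the message pattern is copied from the log line quoted above.

```python
import warnings


def ignore_ref_resolution_deprecation() -> None:
    """Ignore only the jsonschema RefResolutionError deprecation warning.

    The `message` argument is a regular expression matched against the start
    of the warning text, so other DeprecationWarnings are still reported.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"jsonschema\.exceptions\.RefResolutionError is deprecated",
        category=DeprecationWarning,
    )
```

Calling this once at test start-up (for example from a `conftest.py`) keeps the filter narrow, so unrelated deprecations remain visible.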

russellb requested a review from mgoin as a code owner May 8, 2025 13:20

mergify bot added the v1 label May 8, 2025

DarkLight1337 approved these changes May 8, 2025

DarkLight1337 enabled auto-merge (squash) May 8, 2025 13:21

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) May 8, 2025

russellb added 3 commits May 12, 2025 16:42

Fix a few spots I missed - expand output length, disable additional properties

d820a9f

Signed-off-by: Russell Bryant <rbryant@redhat.com>

string invalid JSON characters that xgrammar allows

57337db

workaround for mlc-ai/xgrammar#286 to avoid CI failures Signed-off-by: Russell Bryant <rbryant@redhat.com>

russellb force-pushed the expand-output-length-for-json-tests branch from a8da442 to 57337db May 12, 2025 16:42

DarkLight1337 merged commit ebab1ac into vllm-project:main May 12, 2025
40 of 41 checks passed

RonaldBXu pushed a commit to RonaldBXu/vllm that referenced this pull request May 12, 2025

[CI] Make JSON output tests less likely to fail (vllm-project#17859)

af18e96

Signed-off-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: Ronald Xu <ronaldxu@amazon.com>

mawong-amd pushed a commit to ROCm/vllm that referenced this pull request May 14, 2025

[CI] Make JSON output tests less likely to fail (vllm-project#17859)

736946b

Signed-off-by: Russell Bryant <rbryant@redhat.com>

NickLucche mentioned this pull request May 15, 2025

[PD] Heterogenous TP + #7 robertgshaw2-redhat/vllm#14

Closed
