[CI] Make JSON output tests less likely to fail #17859

Merged

russellb
Member

@russellb russellb commented May 8, 2025

We occasionally see the JSON format structured output tests fail in CI.
PR #17490 included a change to the prompts asking to make the response
as short as possible. This change includes a couple more things to help:

  • Increase the output length limit. The failures occur when we cut off
    the output before a JSON object is properly terminated.

  • Set additionalProperties to False in each JSON schema used. This
    should restrict the model from adding properties not specified in the
    schemas, unnecessarily increasing the size of the JSON object output
    and making it more likely to hit the length limit before it finishes.
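The two changes above can be sketched in a few lines. This is a minimal illustration, not the code from this PR: `forbid_additional_properties`, `is_complete_json`, and the sample schema are hypothetical names and data invented for the example. It only assumes that the schemas are plain Python dicts and that a cut-off response is detected by a failed JSON parse.

```python
import copy
import json


def forbid_additional_properties(schema: dict) -> dict:
    """Return a copy of `schema` with additionalProperties=False on object schemas.

    Hypothetical helper for illustration. It walks every nested dict and list,
    so it is a simplification: literal data under `enum` or `const` that happens
    to look like an object schema would be touched as well.
    """
    result = copy.deepcopy(schema)

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object":
                # setdefault keeps any explicit additionalProperties setting.
                node.setdefault("additionalProperties", False)
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(result)
    return result


def is_complete_json(text: str) -> bool:
    """Return True if `text` parses as JSON, False if it is cut off or invalid."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Made-up schema and outputs, used only to exercise the helpers.
sample_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
    "required": ["name"],
}

strict_schema = forbid_additional_properties(sample_schema)

full_output = '{"name": "Ada", "address": {"city": "London"}}'
# Simulates hitting the output length limit before the object is closed.
truncated_output = full_output[:30]
```

Here `is_complete_json(truncated_output)` is false because the object is never terminated, which is the failure mode a larger output limit is meant to avoid. The stricter schema gives the model fewer ways to lengthen the object.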

Signed-off-by: Russell Bryant rbryant@redhat.com

@russellb russellb requested a review from mgoin as a code owner May 8, 2025 13:20

github-actions bot commented May 8, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the v1 label May 8, 2025
Member

@DarkLight1337 DarkLight1337 left a comment

LGTM thanks

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) May 8, 2025 13:21
@github-actions github-actions bot added the ready label May 8, 2025
@DarkLight1337
Member

Hmm the test is still failing

@russellb
Member Author

russellb commented May 8, 2025

Hmm the test is still failing

The part that failed is a spot I missed in my changes. I pushed an update. Hopefully this time it's green ...

@DarkLight1337
Member

There seems to be yet another error

@russellb
Member Author

russellb commented May 9, 2025

The error in the last failure is:

[2025-05-08T18:07:45Z] /usr/local/lib/python3.12/dist-packages/schemathesis/generation/coverage.py:305: DeprecationWarning: jsonschema.exceptions.RefResolutionError is deprecated as of version 4.18.0. If you wish to catch potential reference resolution errors, directly catch referencing.exceptions.Unresolvable.
[2025-05-08T18:07:45Z] ref_error: type[Exception] = jsonschema.RefResolutionError,

This was the test:

[2025-05-08T18:07:45Z] FAILED v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[meta-llama/Meta-Llama-3.1-8B-Instruct-xgrammar-auto-speculative_config6] - vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.  

I'll dig into it ...

@russellb
Member Author

russellb commented May 9, 2025

I ran the failed test in a loop for about an hour and never hit the same failure. I did hit a different one, which turned out to be a known issue in xgrammar: mlc-ai/xgrammar#286. xgrammar allows invalid characters inside a JSON string, so that is probably causing occasional CI failures as well.
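To show why an invalid character inside a JSON string breaks a test that parses the output, here is a small sketch using only Python's standard `json` module. The thread does not say which characters are involved, so the unescaped newline below is an assumption chosen for illustration. `parses_as_strict_json` is a hypothetical helper, not part of vLLM or xgrammar.

```python
import json


def parses_as_strict_json(text: str) -> bool:
    """Return True if `text` is accepted by json.loads in its default strict mode."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# The newline is written as the two-character escape \n, which is valid JSON.
escaped_output = '{"note": "line one\\nline two"}'

# The newline is a raw control character inside the string, which strict
# JSON parsing rejects. This stands in for the "invalid characters" case;
# the actual characters are an assumption.
raw_output = '{"note": "line one\nline two"}'
```

With these inputs, `parses_as_strict_json(escaped_output)` is true and `parses_as_strict_json(raw_output)` is false. Passing `strict=False` to `json.loads` would accept the raw control character, but a test that checks for valid JSON output would usually keep the strict default.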

I think the current PR is probably a net improvement, though there is still work to do.

@Stranger6667

@russellb

[2025-05-08T18:07:45Z] /usr/local/lib/python3.12/dist-packages/schemathesis/generation/coverage.py:305: DeprecationWarning: jsonschema.exceptions.RefResolutionError is deprecated as of version 4.18.0. If you wish to catch potential reference resolution errors, directly catch referencing.exceptions.Unresolvable.
[2025-05-08T18:07:45Z] ref_error: type[Exception] = jsonschema.RefResolutionError,

FWIW, from the Schemathesis side, you can safely ignore that specific warning: the jsonschema version is pinned in Schemathesis, and RefResolutionError will only be removed in jsonschema==5. I should probably add a warning filter to Schemathesis to avoid this irrelevant output.
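A warning filter of the kind mentioned above could look roughly like this. It uses only the standard `warnings` module and is not Schemathesis's actual implementation. The function names and the message pattern are assumptions based on the log line quoted in this thread.

```python
import warnings

# Regex matched against the start of the warning message; taken from the
# DeprecationWarning text quoted in the CI log.
MESSAGE_PATTERN = r"jsonschema\.exceptions\.RefResolutionError is deprecated"


def ignore_ref_resolution_deprecation() -> None:
    """Install a filter that silences only the RefResolutionError deprecation."""
    warnings.filterwarnings(
        "ignore",
        message=MESSAGE_PATTERN,
        category=DeprecationWarning,
    )


def demo() -> list[str]:
    """Emit two deprecation warnings and return the messages that get through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Installed after "always", so this more specific filter is checked first.
        ignore_ref_resolution_deprecation()
        warnings.warn(
            "jsonschema.exceptions.RefResolutionError is deprecated "
            "as of version 4.18.0.",
            DeprecationWarning,
        )
        warnings.warn("some other API is deprecated", DeprecationWarning)
    return [str(w.message) for w in caught]
```

Calling `demo()` returns only the unrelated message, so other deprecation warnings stay visible. The filter is matched on message and category rather than disabling `DeprecationWarning` globally, which keeps the suppression narrow.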

russellb added 3 commits May 12, 2025 16:42
We occasionally see the JSON format structured output tests fail in CI.
PR vllm-project#17490 included a change to the prompts asking to make the response
as short as possible. This change includes a couple more things to help:

- Increase the output length limit. The failures occur when we cut off
  the output before a JSON object is properly terminated.

- Set `additionalProperties` to `False` in each JSON schema used. This
  should restrict the model from adding properties not specified in the
  schemas, unnecessarily increasing the size of the JSON object output
  and making it more likely to hit the length limit before it finishes.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
…roperties

Signed-off-by: Russell Bryant <rbryant@redhat.com>
workaround for mlc-ai/xgrammar#286
to avoid CI failures

Signed-off-by: Russell Bryant <rbryant@redhat.com>
@russellb russellb force-pushed the expand-output-length-for-json-tests branch from a8da442 to 57337db Compare May 12, 2025 16:42
@DarkLight1337 DarkLight1337 merged commit ebab1ac into vllm-project:main May 12, 2025
40 of 41 checks passed
RonaldBXu pushed a commit to RonaldBXu/vllm that referenced this pull request May 12, 2025
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: Ronald Xu <ronaldxu@amazon.com>
mawong-amd pushed a commit to ROCm/vllm that referenced this pull request May 14, 2025
Signed-off-by: Russell Bryant <rbryant@redhat.com>