Inconsistent parsing behavior #182

Open
anton-ryzhov opened this issue Apr 15, 2025 · 5 comments
Comments

anton-ryzhov commented Apr 15, 2025

I want to report an inconsistent behavior we've observed. It affects production, and it was very confusing and hard to understand and debug.

Initially we had a replica set of hypercorn-based applications, and the same request was processed fine by the majority of instances but failed on a few others. The server didn't even log those requests.

After investigating, I discovered that this limit (max_incomplete_event_size) is not always enforced. The library appends the next chunk and tries to parse the buffer before enforcing the limit, so a request that exceeds the limit will still be parsed successfully if it arrives in a single chunk (which is often the case).

So in practice the same request may or may not be parsed successfully, depending on how the OS and the TCP stack chunk it.
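
To make this concrete, here is a minimal sketch using h11's public API (Connection, receive_data, next_event); the sizes here are made up for the example:

import h11

LIMIT = 16 * 1024  # h11's default max_incomplete_event_size

# A request whose headers slightly exceed the limit:
request = (
    b"GET / HTTP/1.1\r\nHost: example\r\nX-Pad: "
    + b"x" * (LIMIT + 1000)
    + b"\r\n\r\n"
)

# Single chunk: parsed successfully even though it exceeds the limit,
# because the size check only runs when a parse attempt needs more data.
conn = h11.Connection(our_role=h11.SERVER, max_incomplete_event_size=LIMIT)
conn.receive_data(request)
print(conn.next_event())  # Request(method=b'GET', ...)

# Two chunks: the first next_event() call sees an incomplete, over-limit
# buffer and raises an error.
conn = h11.Connection(our_role=h11.SERVER, max_incomplete_event_size=LIMIT)
conn.receive_data(request[: LIMIT + 100])
try:
    conn.next_event()
except h11.RemoteProtocolError as exc:
    print("rejected:", exc)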

Here is a demo:
h11-issue-demo.zip

$ pip install --requirement requirements.txt
$ pytest
FAILED test_client.py::test_fails_slightly_above_limit[uvicorn_endpoint-False] - assert not True
FAILED test_client.py::test_fails_slightly_above_limit[hypercorn_endpoint-False] - assert not True
========================================================================================== 2 failed, 10 passed in 2.16s ===========================================================================================

Inconsistency highlight:

$ pytest -k test_fails_slightly_above_limit
FAILED test_client.py::test_fails_slightly_above_limit[uvicorn_endpoint-False] - assert not True
FAILED test_client.py::test_fails_slightly_above_limit[hypercorn_endpoint-False] - assert not True
==================================================================================== 2 failed, 2 passed, 8 deselected in 0.84s ====================================================================================

I think the behavior should be more stable and consistent. Should the library check the buffer size before the parsing attempt? What do you think?

@anton-ryzhov
Author

This affects at least uvicorn and hypercorn. CC @Kludex @pgjones


njsmith commented Apr 24, 2025

Seems like a reasonable thing to fix, want to send a PR?

@anton-ryzhov
Author

I've started with tests and have found this and this. So the code explicitly ensures this behavior. Do you remember the motivation for handling it this way?

BTW, the code has been there almost since the very beginning.


njsmith commented May 19, 2025

The main purpose the limit exists in the first place is to prevent attackers from performing memory-exhaustion attacks or triggering super-linear parsing times. I wasn't thinking much about consistency; the current behavior works fine for that. It's also important to allow arbitrarily large network chunk sizes, for performance.

Looking at this again, though, I guess this might be pretty difficult to fix without breaking valid code or adding a lot of performance overhead...

Not sure whether it would make more sense to try to fix this here, or to improve hypercorn's logging/error response here so that if you do hit it, it's easy to debug...

@anton-ryzhov
Author

> The main purpose the limit exists in the first place is to prevent attackers from performing memory-exhaustion attacks or triggering super-linear parsing times.

Yeah, this reason was clear.

> It's also important to allow arbitrarily large network chunk sizes, for performance.

But what's the reason for that? Shouldn't the library check the _receive_buffer size in receive_data to protect itself from uncontrolled growth?

Something like

connection.receive_data(b"GET /" + one_gb_of_chars + b" HTTP/1.0\r\n\r\n")
connection.next_event()

won't help

> to prevent attackers from performing memory-exhaustion attacks or triggering super-linear parsing times

If it's expected that an application won't call receive_data with crazy-big buffers, wouldn't it be a good idea to limit the buffer to max_incomplete_event_size?
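
Something like this hypothetical sketch (it relies on h11's private _receive_buffer and _max_incomplete_event_size attributes, so it's an illustration, not a real patch):

import h11

class LimitedConnection(h11.Connection):
    # Hypothetical: enforce the limit at receive time, before any parsing.
    # Note the trade-off: this would also reject large single-chunk reads
    # (e.g. big body chunks) that the current behavior accepts.
    def receive_data(self, data: bytes) -> None:
        if len(self._receive_buffer) + len(data) > self._max_incomplete_event_size:
            raise h11.RemoteProtocolError("receive buffer too long")
        super().receive_data(data)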

> Not sure whether it would make more sense to try to fix this here, or to improve hypercorn's logging/error response here so that if you do hit it, it's easy to debug...

That affects other servers too, at least uvicorn. Logging there could definitely be improved, but that won't make the case any less confusing.

In our case the server would raise an error for 1-2% of exactly the same requests. I'd prefer to have them all fail (as they all effectively violate the max_incomplete_event_size limit), so that it shows up in monitoring and we can fix the client.
