Inconsistent parsing behavior #182

Open
anton-ryzhov opened this issue Apr 15, 2025 · 5 comments
Comments

anton-ryzhov commented Apr 15, 2025

I want to report an inconsistent behavior we've observed. It affects production, and it was very confusing and hard to understand and debug.

Initially we had a replica set of hypercorn-based applications, and the same request was processed fine by the majority of instances but failed on a few others. The server didn't even log those requests.

After investigating, I discovered that this limit (max_incomplete_event_size) is not always enforced. The library appends the next chunk and tries to parse the buffer before enforcing the limit, so a request that exceeds the limit will still be parsed successfully if it arrives in a single chunk (which is often the case).

So in practice the same request may or may not be parsed successfully, depending on how the OS and the TCP stack chunk it.
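
To make this concrete, here is a minimal sketch using h11's public API (Connection, receive_data, next_event); the sizes here are made up for the example:

import h11

LIMIT = 16 * 1024  # h11's default max_incomplete_event_size

# A request whose headers slightly exceed the limit:
request = (
    b"GET / HTTP/1.1\r\nHost: example\r\nX-Pad: "
    + b"x" * (LIMIT + 1000)
    + b"\r\n\r\n"
)

# Single chunk: parsed successfully even though it exceeds the limit,
# because the size check only runs when a parse attempt needs more data.
conn = h11.Connection(our_role=h11.SERVER, max_incomplete_event_size=LIMIT)
conn.receive_data(request)
print(conn.next_event())  # Request(method=b'GET', ...)

# Two chunks: the first next_event() call sees an incomplete, over-limit
# buffer and raises an error.
conn = h11.Connection(our_role=h11.SERVER, max_incomplete_event_size=LIMIT)
conn.receive_data(request[: LIMIT + 100])
try:
    conn.next_event()
except h11.RemoteProtocolError as exc:
    print("rejected:", exc)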

Here is a demo:
h11-issue-demo.zip

$ pip install --requirement requirements.txt
$ pytest
FAILED test_client.py::test_fails_slightly_above_limit[uvicorn_endpoint-False] - assert not True
FAILED test_client.py::test_fails_slightly_above_limit[hypercorn_endpoint-False] - assert not True
========================================================================================== 2 failed, 10 passed in 2.16s ===========================================================================================

Inconsistency highlight:

$ pytest -k test_fails_slightly_above_limit
FAILED test_client.py::test_fails_slightly_above_limit[uvicorn_endpoint-False] - assert not True
FAILED test_client.py::test_fails_slightly_above_limit[hypercorn_endpoint-False] - assert not True
==================================================================================== 2 failed, 2 passed, 8 deselected in 0.84s ====================================================================================

I think the behavior should be more stable and consistent. Should the library check the buffer size before the parsing attempt? What do you think?

@anton-ryzhov
Author

This affects at least uvicorn and hypercorn. CC @Kludex @pgjones


njsmith commented Apr 24, 2025

Seems like a reasonable thing to fix, want to send a PR?

@anton-ryzhov
Author

I've started with tests and have found this and this. So the code explicitly ensures this behavior. Do you remember the motivation for handling it this way?

BTW, the code has been there almost since the very beginning.


njsmith commented May 19, 2025

The main purpose the limit exists in the first place is to prevent attackers from performing memory-exhaustion attacks or triggering super-linear parsing times. I wasn't thinking much about consistency; the current behavior works fine for that. It's also important to allow arbitrarily large network chunk sizes, for performance.

Looking at this again, though, I guess this might be pretty difficult to fix without breaking valid code or adding a lot of performance overhead...

Not sure whether it would make more sense to try to fix this here, or to improve hypercorn's logging/error response here so that if you do hit it, it's easy to debug...

@anton-ryzhov
Author

> The main purpose the limit exists in the first place is to prevent attackers from performing memory-exhaustion attacks or triggering super-linear parsing times.

Yeah, this reason was clear.

> It's also important to allow arbitrarily large network chunk sizes, for performance.

But what's the reason for that? Shouldn't the library check the _receive_buffer size in receive_data to protect itself from uncontrolled growth?

Something like

connection.receive_data(b"GET /" + one_gb_of_chars + b" HTTP/1.0\r\n\r\n")
connection.next_event()

won't help

> to prevent attackers from performing memory-exhaustion attacks or triggering super-linear parsing times

If it's expected that an application won't call receive_data with crazy-big buffers, wouldn't it be a good idea to limit the buffer to max_incomplete_event_size?
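
Something like this hypothetical sketch (it relies on h11's private _receive_buffer and _max_incomplete_event_size attributes, so it's an illustration, not a real patch):

import h11

class LimitedConnection(h11.Connection):
    # Hypothetical: enforce the limit at receive time, before any parsing.
    # Note the trade-off: this would also reject large single-chunk reads
    # (e.g. big body chunks) that the current behavior accepts.
    def receive_data(self, data: bytes) -> None:
        if len(self._receive_buffer) + len(data) > self._max_incomplete_event_size:
            raise h11.RemoteProtocolError("receive buffer too long")
        super().receive_data(data)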

> Not sure whether it would make more sense to try to fix this here, or to improve hypercorn's logging/error response here so that if you do hit it, it's easy to debug...

That affects other servers too, at least uvicorn. Logging there could definitely be improved, but that won't make the case any less confusing.

In our case the server would raise an error for 1-2% of exactly the same requests. I'd prefer to have them all fail (as they all effectively violate the max_incomplete_event_size limit), so that it shows up in monitoring and we can fix the client.
