feature - avoid utf-8 decoding for text frames #1376
Comments
I understand your use case and, indeed, you cannot do this with the current API. For receiving frames, it would mean an API like This raises the question of providing a symmetrical API for sending
That would work quite well. I guess I misunderstood the code, because it looked to me as if the recv() method is decoupled from where the actual processing of inbound data happens (read_message()). The solution you propose would certainly be more flexible.
Just thinking about the send side: I think it really is less important. There aren't too many servers that are strict in what they accept, especially when they are expecting text. If we implement it for send, the effect is the same (skip encode, skip decode), but the names of the options will be different, e.g. decode_text_frames=False for recv() and send_as_text=True for send().
Yes, we need to pick the names for both sides carefully and, ideally, consistently.
If we have two names, I'd like some symmetry e.g. using the words
I'm finding myself in the same position, trying to send data encoded with Any chance this gets added?
This will be added as part of the new asyncio implementation (#1332).
Also support decoding binary frames. Fix #1376.
The new asyncio implementation supports (Also I'm not planning to work on the other features discussed above, notably
Previously, a latch was used to synchronize the user thread reading messages and the background thread reading from the network. This required two thread switches per message.

Now, the background thread writes messages to a queue, from which the user thread reads. This allows passing several frames at each thread switch, reducing the overhead.

With this server code::

    async def test(websocket):
        for i in range(int(await websocket.recv())):
            await websocket.send(f"{{\"iteration\": {i}}}")

and this client code::

    with connect("ws://localhost:8765", compression=None) as websocket:
        websocket.send("1_000_000")
        for message in websocket:
            pass

an unscientific benchmark (running it on my laptop) shows a 2.5x speedup, going from 11 seconds to 4.4 seconds.

Setting a very large recv_bufsize and max_size doesn't yield significant further improvement.

The new implementation mirrors the asyncio implementation and gains the option to prevent or force decoding of frames. Refs #1376.
Previously, a latch was used to synchronize the user thread reading messages and the background thread reading from the network. This required two thread switches per message.

Now, the background thread writes messages to a queue, from which the user thread reads. This allows passing several frames at each thread switch, reducing the overhead.

With this server code:

    async def test(websocket):
        for i in range(int(await websocket.recv())):
            await websocket.send(f"{{\"iteration\": {i}}}")

    async with serve(test, "localhost", 8765) as server:
        await server.serve_forever()

and this client code:

    with connect("ws://localhost:8765", compression=None) as websocket:
        websocket.send("1_000_000")
        for message in websocket:
            pass

an unscientific benchmark (running it on my laptop) shows a 2.5x speedup, going from 11 seconds to 4.4 seconds.

Setting a very large recv_bufsize and max_size doesn't yield significant further improvement.

Flow control was tested by inserting debug logs in maybe_pause/resume() and by measuring the wait for the recv_flow_control lock. It showed the expected behavior of pausing and unpausing coupled with some wait time.

The new implementation mirrors the asyncio implementation and gains the option to prevent or force decoding of frames. Fix #1376 for the threading implementation.
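To illustrate the queue-based handoff described in this commit message, here is a rough sketch using only the standard library. It is not the websockets implementation: the names `reader`, `consume`, `run` and `_EOF` are made up for this example, and a plain list stands in for frames arriving from the network.

```python
import queue
import threading

# Hypothetical sentinel marking the end of the stream; not a websockets name.
_EOF = object()


def reader(frames, out):
    """Background thread: push each frame to the queue without waiting
    for the consumer, so several frames can pile up between switches."""
    for frame in frames:
        out.put(frame)
    out.put(_EOF)


def consume(out):
    """User thread: blocks only while the queue is empty."""
    received = []
    while True:
        item = out.get()
        if item is _EOF:
            return received
        received.append(item)


def run(frames):
    """Wire a producer thread to a consumer through a single queue."""
    out = queue.Queue()
    worker = threading.Thread(target=reader, args=(frames, out), daemon=True)
    worker.start()
    received = consume(out)
    worker.join()
    return received
```

Because the producer never waits for an acknowledgement, the consumer can drain several queued frames each time it is scheduled, which is the effect the commit message credits for the reduced overhead. The sketch leaves out flow control entirely.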
Just because it is supposed to be in UTF-8 doesn't mean I prefer it in that form. Specifically, my use case is giving the data to orjson and passing it around as an orjson.Fragment().
Here are the documents for that use case.
https://github.com/ijl/orjson#deserialize
https://github.com/ijl/orjson#fragment
Looking at the websockets code, if such a capability were to be implemented, it seems like we'd want to add a flag to WebSocketCommonProtocol() and then use it to force binary around the time it decides whether to decode the frame or not, located here:
https://github.com/python-websockets/websockets/blob/main/src/websockets/legacy/protocol.py#L1053
I'd be happy to whip up a patch in case you would consider this feature request.
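As a rough sketch of what such a flag could look like, assuming a tri-state `decode` option (the helper name `deliver` and the option name are hypothetical here, not existing websockets API; only the opcode values come from RFC 6455):

```python
OP_TEXT = 0x1    # opcode of a text frame in RFC 6455
OP_BINARY = 0x2  # opcode of a binary frame in RFC 6455


def deliver(opcode, payload, decode=None):
    """Hypothetical helper: choose between returning str or bytes.

    decode=None  -> follow the frame type (text -> str, binary -> bytes)
    decode=False -> always return the raw bytes, skipping UTF-8 decoding
    decode=True  -> always decode as UTF-8, even for binary frames
    """
    if decode is None:
        decode = opcode == OP_TEXT
    return payload.decode("utf-8") if decode else payload
```

With `decode=False`, the payload of a text frame stays as bytes and could be handed to a bytes-oriented parser without an intermediate str.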