Commit 5eafbe4

Rewrite documentation of buffers.
Describe all implementations. Also update documentation of compression.
1 parent 8385cf0 commit 5eafbe4

9 files changed (+316 -224 lines)


.gitignore (+1 -1)

@@ -8,7 +8,7 @@
 .tox
 build/
 compliance/reports/
-experiments/compression/corpus.pkl
+experiments/compression/corpus/
 dist/
 docs/_build/
 htmlcov/

docs/topics/compression.rst (+96 -77)

@@ -7,37 +7,36 @@ Most WebSocket servers exchange JSON messages because they're convenient to
 parse and serialize in a browser. These messages contain text data and tend to
 be repetitive.
 
-This makes the stream of messages highly compressible. Enabling compression
+This makes the stream of messages highly compressible. Compressing messages
 can reduce network traffic by more than 80%.
 
-There's a standard for compressing messages. :rfc:`7692` defines WebSocket
-Per-Message Deflate, a compression extension based on the Deflate_ algorithm.
+websockets implements WebSocket Per-Message Deflate, a compression extension
+based on the Deflate_ algorithm specified in :rfc:`7692`.
 
 .. _Deflate: https://en.wikipedia.org/wiki/Deflate
 
-Configuring compression
------------------------
+:func:`~websockets.asyncio.client.connect` and
+:func:`~websockets.asyncio.server.serve` enable compression by default because
+the reduction in network bandwidth is usually worth the additional memory and
+CPU cost.
 
-:func:`~websockets.client.connect` and :func:`~websockets.server.serve` enable
-compression by default because the reduction in network bandwidth is usually
-worth the additional memory and CPU cost.
 
-If you want to disable compression, set ``compression=None``::
+Configuring compression
+-----------------------
 
-    import websockets
+To disable compression, set ``compression=None``::
 
-    websockets.connect(..., compression=None)
+    connect(..., compression=None, ...)
 
-    websockets.serve(..., compression=None)
+    serve(..., compression=None, ...)
 
-If you want to customize compression settings, you can enable the Per-Message
-Deflate extension explicitly with :class:`ClientPerMessageDeflateFactory` or
+To customize compression settings, enable the Per-Message Deflate extension
+explicitly with :class:`ClientPerMessageDeflateFactory` or
 :class:`ServerPerMessageDeflateFactory`::
 
-    import websockets
     from websockets.extensions import permessage_deflate
 
-    websockets.connect(
+    connect(
         ...,
         extensions=[
            permessage_deflate.ClientPerMessageDeflateFactory(
@@ -46,9 +45,10 @@ Deflate extension explicitly with :class:`ClientPerMessageDeflateFactory` or
                 compress_settings={"memLevel": 4},
             ),
         ],
+        ...,
     )
 
-    websockets.serve(
+    serve(
         ...,
         extensions=[
             permessage_deflate.ServerPerMessageDeflateFactory(
@@ -57,13 +57,14 @@ Deflate extension explicitly with :class:`ClientPerMessageDeflateFactory` or
                 compress_settings={"memLevel": 4},
             ),
         ],
+        ...,
     )
 
 The Window Bits and Memory Level values in these examples reduce memory usage
 at the expense of compression rate.
 
-Compression settings
---------------------
+Compression parameters
+----------------------
 
 When a client and a server enable the Per-Message Deflate extension, they
 negotiate two parameters to guarantee compatibility between compression and
@@ -81,9 +82,9 @@ and memory usage for both sides.
 This requires retaining the compression context and state between messages,
 which increases the memory footprint of a connection.
 
-* **Window Bits** controls the size of the compression context. It must be
-  an integer between 9 (lowest memory usage) and 15 (best compression).
-  Setting it to 8 is possible but rejected by some versions of zlib.
+* **Window Bits** controls the size of the compression context. It must be an
+  integer between 9 (lowest memory usage) and 15 (best compression). Setting it
+  to 8 is possible but rejected by some versions of zlib and not very useful.
 
   On the server side, websockets defaults to 12. Specifically, the compression
   window size (server to client) is always 12 while the decompression window
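
A quick way to see the Window Bits trade-off discussed in this hunk is to drive :mod:`zlib` directly. This is a standalone sketch: the sample message is invented for illustration, and the negative ``wbits`` models the raw deflate stream that Per-Message Deflate uses.

```python
import zlib

# A repetitive JSON-ish payload, typical of WebSocket traffic (invented sample).
message = b'{"event": "update", "id": 123, "status": "ok"}' * 100

def raw_deflate_size(window_bits, mem_level):
    # Per-Message Deflate uses raw deflate: negative wbits omits the zlib header.
    compressor = zlib.compressobj(wbits=-window_bits, memLevel=mem_level)
    return len(compressor.compress(message) + compressor.flush())

# 9 = lowest memory usage, 15 = best compression.
for wbits in (9, 12, 15):
    print(wbits, raw_deflate_size(wbits, 5))

# Round trip: the decompression window must be at least as large as the
# compression window used by the peer.
compressor = zlib.compressobj(wbits=-12, memLevel=5)
data = compressor.compress(message) + compressor.flush()
assert zlib.decompressobj(wbits=-12).decompress(data) == message
```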
@@ -94,9 +95,8 @@ and memory usage for both sides.
   has the same effect as defaulting to 15.
 
 :mod:`zlib` offers additional parameters for tuning compression. They control
-the trade-off between compression rate, memory usage, and CPU usage only for
-compressing. They're transparent for decompressing. Unless mentioned
-otherwise, websockets inherits defaults of :func:`~zlib.compressobj`.
+the trade-off between compression rate, memory usage, and CPU usage for
+compressing. They're transparent for decompressing.
 
 * **Memory Level** controls the size of the compression state. It must be an
   integer between 1 (lowest memory usage) and 9 (best compression).
@@ -108,96 +108,92 @@ otherwise, websockets inherits defaults of :func:`~zlib.compressobj`.
 * **Compression Level** controls the effort to optimize compression. It must
   be an integer between 1 (lowest CPU usage) and 9 (best compression).
 
+  websockets relies on the default value chosen by :func:`~zlib.compressobj`,
+  ``Z_DEFAULT_COMPRESSION``.
+
 * **Strategy** selects the compression strategy. The best choice depends on
   the type of data being compressed.
 
+  websockets relies on the default value chosen by :func:`~zlib.compressobj`,
+  ``Z_DEFAULT_STRATEGY``.
 
-Tuning compression
-------------------
+To customize these parameters, add keyword arguments for
+:func:`~zlib.compressobj` in ``compress_settings``.
 
-For servers
-...........
+Default settings for servers
+----------------------------
 
 By default, websockets enables compression with conservative settings that
 optimize memory usage at the cost of a slightly worse compression rate:
-Window Bits = 12 and Memory Level = 5.  This strikes a good balance for small
+Window Bits = 12 and Memory Level = 5. This strikes a good balance for small
 messages that are typical of WebSocket servers.
 
-Here's how various compression settings affect memory usage of a single
-connection on a 64-bit system, as well a benchmark of compressed size and
-compression time for a corpus of small JSON documents.
+Here's an example of how compression settings affect memory usage per
+connection, compressed size, and compression time for a corpus of JSON
+documents.
 
 =========== ============ ============ ================ ================
 Window Bits Memory Level Memory usage Size vs. default Time vs. default
 =========== ============ ============ ================ ================
-15          8            322 KiB      -4.0%            +15%
-14          7            178 KiB      -2.6%            +10%
-13          6            106 KiB      -1.4%            +5%
-**12**      **5**        **70 KiB**   **=**            **=**
-11          4            52 KiB       +3.7%            -5%
-10          3            43 KiB       +90%             +50%
-9           2            39 KiB       +160%            +100%
-—           —            19 KiB       +452%            —
+15          8            316 KiB      -10%             +10%
+14          7            172 KiB      -7%              +5%
+13          6            100 KiB      -3%              +2%
+**12**      **5**        **64 KiB**   **=**            **=**
+11          4            46 KiB       +10%             +4%
+10          3            37 KiB       +70%             +40%
+9           2            33 KiB       +130%            +90%
+—           —            14 KiB       +350%            —
 =========== ============ ============ ================ ================
 
 Window Bits and Memory Level don't have to move in lockstep. However, other
 combinations don't yield significantly better results than those shown above.
 
-Compressed size and compression time depend heavily on the kind of messages
-exchanged by the application so this example may not apply to your use case.
-
-You can adapt `compression/benchmark.py`_ by creating a list of typical
-messages and passing it to the ``_run`` function.
-
-Window Bits = 11 and Memory Level = 4 looks like the sweet spot in this table.
-
-websockets defaults to Window Bits = 12 and Memory Level = 5 to stay away from
-Window Bits = 10 or Memory Level = 3 where performance craters, raising doubts
-on what could happen at Window Bits = 11 and Memory Level = 4 on a different
+websockets defaults to Window Bits = 12 and Memory Level = 5 to stay away from
+Window Bits = 10 or Memory Level = 3 where performance craters, raising doubts
+on what could happen at Window Bits = 11 and Memory Level = 4 on a different
 corpus.
 
 Defaults must be safe for all applications, hence a more conservative choice.
 
-.. _compression/benchmark.py: https://github.com/python-websockets/websockets/blob/main/experiments/compression/benchmark.py
+Optimizing settings
+-------------------
 
-The benchmark focuses on compression because it's more expensive than
-decompression. Indeed, leaving aside small allocations, theoretical memory
-usage is:
+Compressed size and compression time depend on the structure of messages
+exchanged by your application. As a consequence, default settings may not be
+optimal for your use case.
 
-* ``(1 << (windowBits + 2)) + (1 << (memLevel + 9))`` for compression;
-* ``1 << windowBits`` for decompression.
+To compare how various compression settings perform for your use case:
 
-CPU usage is also higher for compression than decompression.
+1. Create a corpus of typical messages in a directory, one message per file.
+2. Run the `compression/benchmark.py`_ script, passing the directory in
+   argument.
 
-While it's always possible for a server to use a smaller window size for
-compressing outgoing messages, using a smaller window size for decompressing
-incoming messages requires collaboration from clients.
+The script measures compressed size and compression time for all combinations of
+Window Bits and Memory Level. It outputs two tables with absolute values and two
+tables with values relative to websockets' default settings.
 
-When a client doesn't support configuring the size of its compression window,
-websockets enables compression with the largest possible decompression window.
-In most use cases, this is more efficient than disabling compression both ways.
+Pick your favorite settings in these tables and configure them as shown above.
 
-If you are very sensitive to memory usage, you can reverse this behavior by
-setting the ``require_client_max_window_bits`` parameter of
-:class:`ServerPerMessageDeflateFactory` to ``True``.
+.. _compression/benchmark.py: https://github.com/python-websockets/websockets/blob/main/experiments/compression/benchmark.py
 
-For clients
-...........
+Default settings for clients
+----------------------------
 
-By default, websockets enables compression with Memory Level = 5 but leaves
+By default, websockets enables compression with Memory Level = 5 but leaves
 the Window Bits setting up to the server.
 
-There's two good reasons and one bad reason for not optimizing the client side
-like the server side:
+There's two good reasons and one bad reason for not optimizing Window Bits on
+the client side as on the server side:
 
 1. If the maintainers of a server configured some optimized settings, we don't
    want to override them with more restrictive settings.
 
 2. Optimizing memory usage doesn't matter very much for clients because it's
    uncommon to open thousands of client connections in a program.
 
-3. On a more pragmatic note, some servers misbehave badly when a client
-   configures compression settings. `AWS API Gateway`_ is the worst offender.
+3. On a more pragmatic and annoying note, some servers misbehave badly when a
+   client configures compression settings. `AWS API Gateway`_ is the worst
+   offender.
 
 .. _AWS API Gateway: https://github.com/python-websockets/websockets/issues/1065
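
The benchmarking procedure this hunk describes can be approximated with the standard library alone. A toy stand-in for `compression/benchmark.py`_: the generated corpus is an assumption for illustration; the real script reads your own message files and also measures timing.

```python
import zlib

# Stand-in corpus (invented); the real script reads one message per file.
corpus = [
    b'{"id": %d, "status": "ok", "tags": ["alpha", "beta"]}' % i
    for i in range(200)
]

def total_compressed_size(window_bits, mem_level):
    total = 0
    for message in corpus:
        # Fresh compressor per message: no context shared between messages.
        compressor = zlib.compressobj(wbits=-window_bits, memLevel=mem_level)
        total += len(compressor.compress(message) + compressor.flush())
    return total

baseline = total_compressed_size(12, 5)  # websockets' server defaults
for window_bits, mem_level in [(15, 8), (12, 5), (9, 2)]:
    size = total_compressed_size(window_bits, mem_level)
    delta = (size - baseline) / baseline
    print(f"wbits={window_bits:2d} memLevel={mem_level}: {delta:+.1%} vs. default")
```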

@@ -207,6 +203,29 @@ like the server side:
    Until the ecosystem levels up, interoperability with buggy servers seems
    more valuable than optimizing memory usage.
 
+Decompression
+-------------
+
+The discussion above focuses on compression because it's more expensive than
+decompression. Indeed, leaving aside small allocations, theoretical memory
+usage is:
+
+* ``(1 << (windowBits + 2)) + (1 << (memLevel + 9))`` for compression;
+* ``1 << windowBits`` for decompression.
+
+CPU usage is also higher for compression than decompression.
+
+While it's always possible for a server to use a smaller window size for
+compressing outgoing messages, using a smaller window size for decompressing
+incoming messages requires collaboration from clients.
+
+When a client doesn't support configuring the size of its compression window,
+websockets enables compression with the largest possible decompression window.
+In most use cases, this is more efficient than disabling compression both ways.
+
+If you are very sensitive to memory usage, you can reverse this behavior by
+setting the ``require_client_max_window_bits`` parameter of
+:class:`ServerPerMessageDeflateFactory` to ``True``.
 
 Further reading
 ---------------
@@ -216,7 +235,7 @@ settings affect memory usage and how to optimize them.
 
 .. _blog post by Ilya Grigorik: https://www.igvita.com/2013/11/27/configuring-and-optimizing-websocket-compression/
 
-This `experiment by Peter Thorson`_ recommends Window Bits = 11 and Memory
-Level = 4 for optimizing memory usage.
+This `experiment by Peter Thorson`_ recommends Window Bits = 11 and Memory
+Level = 4 for optimizing memory usage.
 
 .. _experiment by Peter Thorson: https://mailarchive.ietf.org/arch/msg/hybi/F9t4uPufVEy8KBLuL36cZjCmM_Y/
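
The theoretical memory formulas introduced by this commit (in the new Decompression section) are easy to check numerically; a quick sketch:

```python
def compression_memory(window_bits, mem_level):
    # Theoretical zlib memory for a compression context, per the formula
    # (1 << (windowBits + 2)) + (1 << (memLevel + 9)), small allocations aside.
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))

def decompression_memory(window_bits):
    # Decompression only needs the sliding window itself: 1 << windowBits.
    return 1 << window_bits

# websockets' server defaults: Window Bits = 12, Memory Level = 5.
assert compression_memory(12, 5) == 32 * 1024   # 32 KiB to compress
assert decompression_memory(12) == 4 * 1024     # 4 KiB to decompress

# Maximum settings: Window Bits = 15, Memory Level = 8.
assert compression_memory(15, 8) == 256 * 1024  # 256 KiB
```

This also illustrates why the asymmetry matters: a peer decompressing with the largest window (15) only spends 32 KiB, far less than compressing at the same settings.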

docs/topics/design.rst (-49)

@@ -488,55 +488,6 @@ they're drained. That's why all APIs that write frames are asynchronous.
 Of course, it's still possible for an application to create its own unbounded
 buffers and break the backpressure. Be careful with queues.
 
-
-.. _buffers:
-
-Buffers
--------
-
-.. note::
-
-    This section discusses buffers from the perspective of a server but it
-    applies to clients as well.
-
-An asynchronous systems works best when its buffers are almost always empty.
-
-For example, if a client sends data too fast for a server, the queue of
-incoming messages will be constantly full. The server will always be 32
-messages (by default) behind the client. This consumes memory and increases
-latency for no good reason. The problem is called bufferbloat.
-
-If buffers are almost always full and that problem cannot be solved by adding
-capacity — typically because the system is bottlenecked by the output and
-constantly regulated by backpressure — reducing the size of buffers minimizes
-negative consequences.
-
-By default websockets has rather high limits. You can decrease them according
-to your application's characteristics.
-
-Bufferbloat can happen at every level in the stack where there is a buffer.
-For each connection, the receiving side contains these buffers:
-
-- OS buffers: tuning them is an advanced optimization.
-- :class:`~asyncio.StreamReader` bytes buffer: the default limit is 64 KiB.
-  You can set another limit by passing a ``read_limit`` keyword argument to
-  :func:`~client.connect()` or :func:`~server.serve`.
-- Incoming messages :class:`~collections.deque`: its size depends both on
-  the size and the number of messages it contains. By default the maximum
-  UTF-8 encoded size is 1 MiB and the maximum number is 32. In the worst case,
-  after UTF-8 decoding, a single message could take up to 4 MiB of memory and
-  the overall memory consumption could reach 128 MiB. You should adjust these
-  limits by setting the ``max_size`` and ``max_queue`` keyword arguments of
-  :func:`~client.connect()` or :func:`~server.serve` according to your
-  application's requirements.
-
-For each connection, the sending side contains these buffers:
-
-- :class:`~asyncio.StreamWriter` bytes buffer: the default size is 64 KiB.
-  You can set another limit by passing a ``write_limit`` keyword argument to
-  :func:`~client.connect()` or :func:`~server.serve`.
-- OS buffers: tuning them is an advanced optimization.
-
 Concurrency
 -----------
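
The worst-case figures in the removed Buffers section follow from the defaults it quotes (``max_size`` = 1 MiB, ``max_queue`` = 32, up to 4 bytes per code point after UTF-8 decoding); a quick arithmetic check:

```python
MiB = 1024 * 1024

max_size = 1 * MiB   # default maximum UTF-8 encoded message size
max_queue = 32       # default maximum number of queued incoming messages

# After UTF-8 decoding, CPython may store up to 4 bytes per code point,
# so a 1 MiB encoded message can occupy up to 4 MiB of memory.
worst_case_message = 4 * max_size
worst_case_queue = max_queue * worst_case_message

assert worst_case_message == 4 * MiB     # 4 MiB per message
assert worst_case_queue == 128 * MiB     # 128 MiB per connection overall
```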
