Commit 5eafbe4

Rewrite documentation of buffers.
Describe all implementations. Also update documentation of compression.
1 parent 8385cf0 commit 5eafbe4

9 files changed (+316 -224 lines)


.gitignore (+1 -1)

@@ -8,7 +8,7 @@
 .tox
 build/
 compliance/reports/
-experiments/compression/corpus.pkl
+experiments/compression/corpus/
 dist/
 docs/_build/
 htmlcov/

docs/topics/compression.rst (+96 -77)

@@ -7,37 +7,36 @@ Most WebSocket servers exchange JSON messages because they're convenient to
 parse and serialize in a browser. These messages contain text data and tend to
 be repetitive.
 
-This makes the stream of messages highly compressible. Enabling compression
+This makes the stream of messages highly compressible. Compressing messages
 can reduce network traffic by more than 80%.
 
-There's a standard for compressing messages. :rfc:`7692` defines WebSocket
-Per-Message Deflate, a compression extension based on the Deflate_ algorithm.
+websockets implements WebSocket Per-Message Deflate, a compression extension
+based on the Deflate_ algorithm specified in :rfc:`7692`.
 
 .. _Deflate: https://en.wikipedia.org/wiki/Deflate
 
-Configuring compression
------------------------
+:func:`~websockets.asyncio.client.connect` and
+:func:`~websockets.asyncio.server.serve` enable compression by default because
+the reduction in network bandwidth is usually worth the additional memory and
+CPU cost.
 
-:func:`~websockets.client.connect` and :func:`~websockets.server.serve` enable
-compression by default because the reduction in network bandwidth is usually
-worth the additional memory and CPU cost.
 
-If you want to disable compression, set ``compression=None``::
+Configuring compression
+-----------------------
 
-    import websockets
+To disable compression, set ``compression=None``::
 
-    websockets.connect(..., compression=None)
+    connect(..., compression=None, ...)
 
-    websockets.serve(..., compression=None)
+    serve(..., compression=None, ...)
 
-If you want to customize compression settings, you can enable the Per-Message
-Deflate extension explicitly with :class:`ClientPerMessageDeflateFactory` or
+To customize compression settings, enable the Per-Message Deflate extension
+explicitly with :class:`ClientPerMessageDeflateFactory` or
 :class:`ServerPerMessageDeflateFactory`::
 
-    import websockets
     from websockets.extensions import permessage_deflate
 
-    websockets.connect(
+    connect(
         ...,
         extensions=[
            permessage_deflate.ClientPerMessageDeflateFactory(
@@ -46,9 +45,10 @@ Deflate extension explicitly with :class:`ClientPerMessageDeflateFactory` or
                 compress_settings={"memLevel": 4},
             ),
         ],
+        ...,
     )
 
-    websockets.serve(
+    serve(
         ...,
         extensions=[
             permessage_deflate.ServerPerMessageDeflateFactory(
@@ -57,13 +57,14 @@ Deflate extension explicitly with :class:`ClientPerMessageDeflateFactory` or
                 compress_settings={"memLevel": 4},
             ),
         ],
+        ...,
     )
 
 The Window Bits and Memory Level values in these examples reduce memory usage
 at the expense of compression rate.
 
-Compression settings
---------------------
+Compression parameters
+----------------------
 
 When a client and a server enable the Per-Message Deflate extension, they
 negotiate two parameters to guarantee compatibility between compression and
@@ -81,9 +82,9 @@ and memory usage for both sides.
 This requires retaining the compression context and state between messages,
 which increases the memory footprint of a connection.
 
-* **Window Bits** controls the size of the compression context. It must be
-  an integer between 9 (lowest memory usage) and 15 (best compression).
-  Setting it to 8 is possible but rejected by some versions of zlib.
+* **Window Bits** controls the size of the compression context. It must be an
+  integer between 9 (lowest memory usage) and 15 (best compression). Setting it
+  to 8 is possible but rejected by some versions of zlib and not very useful.
 
   On the server side, websockets defaults to 12. Specifically, the compression
   window size (server to client) is always 12 while the decompression window
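
A quick way to see the Window Bits trade-off discussed in this hunk is to drive :mod:`zlib` directly. This is a standalone sketch: the sample message is invented for illustration, and the negative ``wbits`` models the raw deflate stream that Per-Message Deflate uses.

```python
import zlib

# A repetitive JSON-ish payload, typical of WebSocket traffic (invented sample).
message = b'{"event": "update", "id": 123, "status": "ok"}' * 100

def raw_deflate_size(window_bits, mem_level):
    # Per-Message Deflate uses raw deflate: negative wbits omits the zlib header.
    compressor = zlib.compressobj(wbits=-window_bits, memLevel=mem_level)
    return len(compressor.compress(message) + compressor.flush())

# 9 = lowest memory usage, 15 = best compression.
for wbits in (9, 12, 15):
    print(wbits, raw_deflate_size(wbits, 5))

# Round trip: the decompression window must be at least as large as the
# compression window used by the peer.
compressor = zlib.compressobj(wbits=-12, memLevel=5)
data = compressor.compress(message) + compressor.flush()
assert zlib.decompressobj(wbits=-12).decompress(data) == message
```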
@@ -94,9 +95,8 @@ and memory usage for both sides.
   has the same effect as defaulting to 15.
 
 :mod:`zlib` offers additional parameters for tuning compression. They control
-the trade-off between compression rate, memory usage, and CPU usage only for
-compressing. They're transparent for decompressing. Unless mentioned
-otherwise, websockets inherits defaults of :func:`~zlib.compressobj`.
+the trade-off between compression rate, memory usage, and CPU usage for
+compressing. They're transparent for decompressing.
 
 * **Memory Level** controls the size of the compression state. It must be an
   integer between 1 (lowest memory usage) and 9 (best compression).
@@ -108,96 +108,92 @@ otherwise, websockets inherits defaults of :func:`~zlib.compressobj`.
 * **Compression Level** controls the effort to optimize compression. It must
   be an integer between 1 (lowest CPU usage) and 9 (best compression).
 
+  websockets relies on the default value chosen by :func:`~zlib.compressobj`,
+  ``Z_DEFAULT_COMPRESSION``.
+
 * **Strategy** selects the compression strategy. The best choice depends on
   the type of data being compressed.
 
+  websockets relies on the default value chosen by :func:`~zlib.compressobj`,
+  ``Z_DEFAULT_STRATEGY``.
 
-Tuning compression
-------------------
+To customize these parameters, add keyword arguments for
+:func:`~zlib.compressobj` in ``compress_settings``.
 
-For servers
-...........
+Default settings for servers
+----------------------------
 
 By default, websockets enables compression with conservative settings that
 optimize memory usage at the cost of a slightly worse compression rate:
-Window Bits = 12 and Memory Level = 5.  This strikes a good balance for small
+Window Bits = 12 and Memory Level = 5. This strikes a good balance for small
 messages that are typical of WebSocket servers.
 
-Here's how various compression settings affect memory usage of a single
-connection on a 64-bit system, as well a benchmark of compressed size and
-compression time for a corpus of small JSON documents.
+Here's an example of how compression settings affect memory usage per
+connection, compressed size, and compression time for a corpus of JSON
+documents.
 
 =========== ============ ============ ================ ================
 Window Bits Memory Level Memory usage Size vs. default Time vs. default
 =========== ============ ============ ================ ================
-15          8            322 KiB      -4.0%            +15%
-14          7            178 KiB      -2.6%            +10%
-13          6            106 KiB      -1.4%            +5%
-**12**      **5**        **70 KiB**   **=**            **=**
-11          4            52 KiB       +3.7%            -5%
-10          3            43 KiB       +90%             +50%
-9           2            39 KiB       +160%            +100%
-—           —            19 KiB       +452%            —
+15          8            316 KiB      -10%             +10%
+14          7            172 KiB      -7%              +5%
+13          6            100 KiB      -3%              +2%
+**12**      **5**        **64 KiB**   **=**            **=**
+11          4            46 KiB       +10%             +4%
+10          3            37 KiB       +70%             +40%
+9           2            33 KiB       +130%            +90%
+—           —            14 KiB       +350%            —
 =========== ============ ============ ================ ================
 
 Window Bits and Memory Level don't have to move in lockstep. However, other
 combinations don't yield significantly better results than those shown above.
 
-Compressed size and compression time depend heavily on the kind of messages
-exchanged by the application so this example may not apply to your use case.
-
-You can adapt `compression/benchmark.py`_ by creating a list of typical
-messages and passing it to the ``_run`` function.
-
-Window Bits = 11 and Memory Level = 4 looks like the sweet spot in this table.
-
-websockets defaults to Window Bits = 12 and Memory Level = 5 to stay away from
-Window Bits = 10 or Memory Level = 3 where performance craters, raising doubts
-on what could happen at Window Bits = 11 and Memory Level = 4 on a different
+websockets defaults to Window Bits = 12 and Memory Level = 5 to stay away from
+Window Bits = 10 or Memory Level = 3 where performance craters, raising doubts
+on what could happen at Window Bits = 11 and Memory Level = 4 on a different
 corpus.
 
 Defaults must be safe for all applications, hence a more conservative choice.
 
-.. _compression/benchmark.py: https://github.com/python-websockets/websockets/blob/main/experiments/compression/benchmark.py
+Optimizing settings
+-------------------
 
-The benchmark focuses on compression because it's more expensive than
-decompression. Indeed, leaving aside small allocations, theoretical memory
-usage is:
+Compressed size and compression time depend on the structure of messages
+exchanged by your application. As a consequence, default settings may not be
+optimal for your use case.
 
-* ``(1 << (windowBits + 2)) + (1 << (memLevel + 9))`` for compression;
-* ``1 << windowBits`` for decompression.
+To compare how various compression settings perform for your use case:
 
-CPU usage is also higher for compression than decompression.
+1. Create a corpus of typical messages in a directory, one message per file.
+2. Run the `compression/benchmark.py`_ script, passing the directory in
+   argument.
 
-While it's always possible for a server to use a smaller window size for
-compressing outgoing messages, using a smaller window size for decompressing
-incoming messages requires collaboration from clients.
+The script measures compressed size and compression time for all combinations of
+Window Bits and Memory Level. It outputs two tables with absolute values and two
+tables with values relative to websockets' default settings.
 
-When a client doesn't support configuring the size of its compression window,
-websockets enables compression with the largest possible decompression window.
-In most use cases, this is more efficient than disabling compression both ways.
+Pick your favorite settings in these tables and configure them as shown above.
 
-If you are very sensitive to memory usage, you can reverse this behavior by
-setting the ``require_client_max_window_bits`` parameter of
-:class:`ServerPerMessageDeflateFactory` to ``True``.
+.. _compression/benchmark.py: https://github.com/python-websockets/websockets/blob/main/experiments/compression/benchmark.py
 
-For clients
-...........
+Default settings for clients
+----------------------------
 
-By default, websockets enables compression with Memory Level = 5 but leaves
+By default, websockets enables compression with Memory Level = 5 but leaves
 the Window Bits setting up to the server.
 
-There's two good reasons and one bad reason for not optimizing the client side
-like the server side:
+There's two good reasons and one bad reason for not optimizing Window Bits on
+the client side as on the server side:
 
 1. If the maintainers of a server configured some optimized settings, we don't
    want to override them with more restrictive settings.
 
 2. Optimizing memory usage doesn't matter very much for clients because it's
    uncommon to open thousands of client connections in a program.
 
-3. On a more pragmatic note, some servers misbehave badly when a client
-   configures compression settings. `AWS API Gateway`_ is the worst offender.
+3. On a more pragmatic and annoying note, some servers misbehave badly when a
+   client configures compression settings. `AWS API Gateway`_ is the worst
+   offender.
 
 .. _AWS API Gateway: https://github.com/python-websockets/websockets/issues/1065
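
The benchmarking procedure this hunk describes can be approximated with the standard library alone. A toy stand-in for `compression/benchmark.py`_: the generated corpus is an assumption for illustration; the real script reads your own message files and also measures timing.

```python
import zlib

# Stand-in corpus (invented); the real script reads one message per file.
corpus = [
    b'{"id": %d, "status": "ok", "tags": ["alpha", "beta"]}' % i
    for i in range(200)
]

def total_compressed_size(window_bits, mem_level):
    total = 0
    for message in corpus:
        # Fresh compressor per message: no context shared between messages.
        compressor = zlib.compressobj(wbits=-window_bits, memLevel=mem_level)
        total += len(compressor.compress(message) + compressor.flush())
    return total

baseline = total_compressed_size(12, 5)  # websockets' server defaults
for window_bits, mem_level in [(15, 8), (12, 5), (9, 2)]:
    size = total_compressed_size(window_bits, mem_level)
    delta = (size - baseline) / baseline
    print(f"wbits={window_bits:2d} memLevel={mem_level}: {delta:+.1%} vs. default")
```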

@@ -207,6 +203,29 @@ like the server side:
    Until the ecosystem levels up, interoperability with buggy servers seems
    more valuable than optimizing memory usage.
 
+Decompression
+-------------
+
+The discussion above focuses on compression because it's more expensive than
+decompression. Indeed, leaving aside small allocations, theoretical memory
+usage is:
+
+* ``(1 << (windowBits + 2)) + (1 << (memLevel + 9))`` for compression;
+* ``1 << windowBits`` for decompression.
+
+CPU usage is also higher for compression than decompression.
+
+While it's always possible for a server to use a smaller window size for
+compressing outgoing messages, using a smaller window size for decompressing
+incoming messages requires collaboration from clients.
+
+When a client doesn't support configuring the size of its compression window,
+websockets enables compression with the largest possible decompression window.
+In most use cases, this is more efficient than disabling compression both ways.
+
+If you are very sensitive to memory usage, you can reverse this behavior by
+setting the ``require_client_max_window_bits`` parameter of
+:class:`ServerPerMessageDeflateFactory` to ``True``.
 
 Further reading
 ---------------
@@ -216,7 +235,7 @@ settings affect memory usage and how to optimize them.
 
 .. _blog post by Ilya Grigorik: https://www.igvita.com/2013/11/27/configuring-and-optimizing-websocket-compression/
 
-This `experiment by Peter Thorson`_ recommends Window Bits = 11 and Memory
-Level = 4 for optimizing memory usage.
+This `experiment by Peter Thorson`_ recommends Window Bits = 11 and Memory
+Level = 4 for optimizing memory usage.
 
 .. _experiment by Peter Thorson: https://mailarchive.ietf.org/arch/msg/hybi/F9t4uPufVEy8KBLuL36cZjCmM_Y/
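
The theoretical memory formulas introduced by this commit (in the new Decompression section) are easy to check numerically; a quick sketch:

```python
def compression_memory(window_bits, mem_level):
    # Theoretical zlib memory for a compression context, per the formula
    # (1 << (windowBits + 2)) + (1 << (memLevel + 9)), small allocations aside.
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))

def decompression_memory(window_bits):
    # Decompression only needs the sliding window itself: 1 << windowBits.
    return 1 << window_bits

# websockets' server defaults: Window Bits = 12, Memory Level = 5.
assert compression_memory(12, 5) == 32 * 1024   # 32 KiB to compress
assert decompression_memory(12) == 4 * 1024     # 4 KiB to decompress

# Maximum settings: Window Bits = 15, Memory Level = 8.
assert compression_memory(15, 8) == 256 * 1024  # 256 KiB
```

This also illustrates why the asymmetry matters: a peer decompressing with the largest window (15) only spends 32 KiB, far less than compressing at the same settings.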

docs/topics/design.rst (-49)

@@ -488,55 +488,6 @@ they're drained. That's why all APIs that write frames are asynchronous.
 Of course, it's still possible for an application to create its own unbounded
 buffers and break the backpressure. Be careful with queues.
 
-
-.. _buffers:
-
-Buffers
--------
-
-.. note::
-
-    This section discusses buffers from the perspective of a server but it
-    applies to clients as well.
-
-An asynchronous systems works best when its buffers are almost always empty.
-
-For example, if a client sends data too fast for a server, the queue of
-incoming messages will be constantly full. The server will always be 32
-messages (by default) behind the client. This consumes memory and increases
-latency for no good reason. The problem is called bufferbloat.
-
-If buffers are almost always full and that problem cannot be solved by adding
-capacity — typically because the system is bottlenecked by the output and
-constantly regulated by backpressure — reducing the size of buffers minimizes
-negative consequences.
-
-By default websockets has rather high limits. You can decrease them according
-to your application's characteristics.
-
-Bufferbloat can happen at every level in the stack where there is a buffer.
-For each connection, the receiving side contains these buffers:
-
-- OS buffers: tuning them is an advanced optimization.
-- :class:`~asyncio.StreamReader` bytes buffer: the default limit is 64 KiB.
-  You can set another limit by passing a ``read_limit`` keyword argument to
-  :func:`~client.connect()` or :func:`~server.serve`.
-- Incoming messages :class:`~collections.deque`: its size depends both on
-  the size and the number of messages it contains. By default the maximum
-  UTF-8 encoded size is 1 MiB and the maximum number is 32. In the worst case,
-  after UTF-8 decoding, a single message could take up to 4 MiB of memory and
-  the overall memory consumption could reach 128 MiB. You should adjust these
-  limits by setting the ``max_size`` and ``max_queue`` keyword arguments of
-  :func:`~client.connect()` or :func:`~server.serve` according to your
-  application's requirements.
-
-For each connection, the sending side contains these buffers:
-
-- :class:`~asyncio.StreamWriter` bytes buffer: the default size is 64 KiB.
-  You can set another limit by passing a ``write_limit`` keyword argument to
-  :func:`~client.connect()` or :func:`~server.serve`.
-- OS buffers: tuning them is an advanced optimization.
-
 Concurrency
 -----------
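
The worst-case figures in the removed Buffers section follow from the defaults it quotes (``max_size`` = 1 MiB, ``max_queue`` = 32, up to 4 bytes per code point after UTF-8 decoding); a quick arithmetic check:

```python
MiB = 1024 * 1024

max_size = 1 * MiB   # default maximum UTF-8 encoded message size
max_queue = 32       # default maximum number of queued incoming messages

# After UTF-8 decoding, CPython may store up to 4 bytes per code point,
# so a 1 MiB encoded message can occupy up to 4 MiB of memory.
worst_case_message = 4 * max_size
worst_case_queue = max_queue * worst_case_message

assert worst_case_message == 4 * MiB     # 4 MiB per message
assert worst_case_queue == 128 * MiB     # 128 MiB per connection overall
```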
