- Deployment guides
- =================
+ Deployment
+ ==========

- Discover how to deploy your application on various platforms.
+ .. currentmodule:: websockets

- Platforms-as-a-Service
+ Architecture decisions
----------------------

+ When you deploy your websockets server to production, at a high level, your
+ architecture will almost certainly look like the following diagram:
+
+ .. image:: architecture.svg
+
+ The basic unit for scaling a websockets server is "one server process". Each
+ blue box in the diagram represents one server process.
+
+ There's more variation in routing. While the routing layer is shown as one big
+ box, it is likely to involve several subsystems.
+
+ As a consequence, when you design a deployment, you must answer two questions:
+
+ 1. How will I run the appropriate number of server processes?
+ 2. How will I route incoming connections to these processes?
+
+ These questions are interrelated. There's a wide range of valid answers,
+ depending on your goals and your constraints.
+
+ Platforms-as-a-Service
+ ......................
+
+ Platforms-as-a-Service are the easiest option. They provide end-to-end,
+ integrated solutions and they require little configuration.
+
+ Here's how to deploy on some popular PaaS providers. Since all PaaS use
+ similar patterns, the concepts translate to other providers.
+
.. toctree::
    :titlesonly:

@@ -14,8 +42,13 @@ Platforms-as-a-Service
    fly
    heroku

- Self-hosted
- -----------
+ Self-hosted infrastructure
+ ..........................
+
+ If you need more control, you can deploy on your own infrastructure. This
+ requires more configuration.
+
+ Here's how to configure some components mentioned in this guide.

.. toctree::
    :titlesonly:
@@ -24,3 +57,160 @@ Self-hosted
    supervisor
    nginx
    haproxy
+
+ Running server processes
+ ------------------------
+
+ How many processes do I need?
+ .............................
+
+ Typically, one server process will manage a few hundred or a few thousand
+ connections, depending on the frequency of messages and the amount of work
+ they require.
+
+ CPU and memory usage increase with the number of connections to the server.
+
+ Often CPU is the limiting factor. If a server process goes to 100% CPU, then
+ you've reached the limit. How much headroom you want to keep is up to you.
+
+ Once you know how many connections a server process can manage and how many
+ connections you need to handle, you can calculate how many processes to run.
+
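+ For example, with purely illustrative numbers: if one process comfortably
+ handles about 2,000 connections and you expect a peak of 50,000 concurrent
+ connections, a quick back-of-the-envelope calculation looks like this:
+
+ .. code-block:: python
+
+     import math
+
+     connections_per_process = 2_000  # measured capacity for your workload
+     peak_connections = 50_000        # expected peak of concurrent connections
+     headroom = 1.2                   # keep 20% spare capacity
+
+     processes = math.ceil(
+         peak_connections * headroom / connections_per_process
+     )
+     assert processes == 30
+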
+ You can also automate this calculation by configuring an autoscaler to keep
+ CPU usage or connection count within acceptable limits.
+
+ .. admonition:: Don't scale with threads. Scale only with processes.
+     :class: tip
+
+     Threads don't make sense for a server built with :mod:`asyncio`.
+
+ How do I run processes?
+ .......................
+
+ Most solutions for running multiple instances of a server process fall into
+ one of these three buckets:
+
+ 1. Running N processes on a platform:
+
+    * a Kubernetes Deployment
+
+    * its equivalent on a Platform as a Service provider
+
+ 2. Running N servers:
+
+    * an AWS Auto Scaling group, a GCP Managed instance group, etc.
+
+    * a fixed set of long-lived servers
+
+ 3. Running N processes on a server:
+
+    * preferably via a process manager or supervisor
+
+ Option 1 is easiest if you have access to such a platform. Option 2 usually
+ combines with option 3.
+
+ How do I start a process?
+ .........................
+
+ Run a Python program that invokes :func:`~asyncio.server.serve` or
+ :func:`~asyncio.router.route`. That's it!
+
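+ As a sketch, a minimal server process could look like this (the echo handler
+ and port 8080 are placeholders for your application):
+
+ .. code-block:: python
+
+     import asyncio
+
+     from websockets.asyncio.server import serve
+
+
+     async def handler(connection):
+         # Echo every incoming message back to the client.
+         async for message in connection:
+             await connection.send(message)
+
+
+     async def main():
+         # Serve until the process is terminated.
+         async with serve(handler, "", 8080) as server:
+             await server.serve_forever()
+
+
+     asyncio.run(main())
+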
+ Don't run an ASGI server such as Uvicorn, Hypercorn, or Daphne. They're
+ alternatives to websockets, not complements.
+
+ Don't run a WSGI server such as Gunicorn, Waitress, or mod_wsgi. They aren't
+ designed to run WebSocket applications.
+
+ Application servers handle network connections and expose a Python API. You
+ don't need one because websockets handles network connections directly.
+
+ How do I stop a process?
+ ........................
+
+ Process managers send the SIGTERM signal to terminate processes. Catch this
+ signal and exit the server to ensure a graceful shutdown.
+
+ Here's an example:
+
+ .. literalinclude:: ../../example/faq/shutdown_server.py
+     :emphasize-lines: 14-16
+
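+ In essence, the example follows this pattern (a sketch only; it assumes a
+ Unix-like system, and the bundled file may differ in details such as the
+ handler and port):
+
+ .. code-block:: python
+
+     import asyncio
+     import signal
+
+     from websockets.asyncio.server import serve
+
+
+     async def handler(connection):
+         async for message in connection:
+             await connection.send(message)
+
+
+     async def main():
+         # Resolve this future when SIGTERM is received.
+         loop = asyncio.get_running_loop()
+         stop = loop.create_future()
+         loop.add_signal_handler(signal.SIGTERM, stop.set_result, None)
+
+         # Exiting the context manager closes all open connections.
+         async with serve(handler, "", 8080):
+             await stop
+
+
+     asyncio.run(main())
+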
+ When exiting the context manager, :func:`~asyncio.server.serve` closes all
+ connections with code 1001 (going away). As a consequence:
+
+ * If the connection handler is awaiting
+   :meth:`~asyncio.server.ServerConnection.recv`, it receives a
+   :exc:`~exceptions.ConnectionClosedOK` exception. It can catch the exception
+   and clean up before exiting, as sketched below.
+
+ * Otherwise, it should be waiting on
+   :meth:`~asyncio.server.ServerConnection.wait_closed`, so it can receive the
+   :exc:`~exceptions.ConnectionClosedOK` exception and exit.
+
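+ On the handler side, this cleanup might look like the following sketch
+ (assuming a handler that reads messages with ``recv()`` and holds some
+ per-connection resources):
+
+ .. code-block:: python
+
+     from websockets.exceptions import ConnectionClosedOK
+
+
+     async def handler(connection):
+         try:
+             while True:
+                 message = await connection.recv()
+                 ...  # process the message
+         except ConnectionClosedOK:
+             # The connection was closed normally, for example because the
+             # server is shutting down; release per-connection resources here.
+             ...
+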
+ The shutdown example is easily adapted to handle other signals.
+
+ If you override the default signal handler for SIGINT, which raises
+ :exc:`KeyboardInterrupt`, be aware that you won't be able to interrupt a
+ program with Ctrl-C anymore when it's stuck in a loop.
+
+ Routing connections
+ -------------------
+
+ What does routing involve?
+ ..........................
+
+ Since the routing layer is directly exposed to the Internet, it should provide
+ appropriate protection against threats ranging from Internet background noise
+ to targeted attacks.
+
+ You should always secure WebSocket connections with TLS. Since the routing
+ layer carries the public domain name, it should terminate TLS connections.
+
+ Finally, it must route connections to the server processes, balancing new
+ connections across them.
+
+ How do I route connections?
+ ...........................
+
+ Here are typical solutions for load balancing, matched to ways of running
+ processes:
+
+ 1. If you're running on a platform, it comes with a routing layer:
+
+    * a Kubernetes Ingress and Service
+
+    * a service mesh: Istio, Consul, Linkerd, etc.
+
+    * the routing mesh of a Platform as a Service
+
+ 2. If you're running N servers, you may load balance with:
+
+    * a cloud load balancer: AWS Elastic Load Balancing, GCP Cloud Load
+      Balancing, etc.
+
+    * a software load balancer: HAProxy, NGINX, etc.
+
+ 3. If you're running N processes on a server, you may load balance with:
+
+    * a software load balancer: HAProxy, NGINX, etc.
+
+    * the operating system, by letting all processes listen on the same port
+      (see the sketch below)
+
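+ For the last option, every process binds the same port with ``SO_REUSEPORT``
+ and the kernel spreads incoming connections across them. A sketch, assuming a
+ platform that supports it (notably Linux) and a placeholder ``handler``:
+
+ .. code-block:: python
+
+     import asyncio
+
+     from websockets.asyncio.server import serve
+
+
+     async def handler(connection):
+         ...
+
+
+     async def main():
+         # reuse_port is forwarded to loop.create_server(), so several
+         # processes can listen on port 8080 at the same time.
+         async with serve(handler, "", 8080, reuse_port=True) as server:
+             await server.serve_forever()
+
+
+     asyncio.run(main())
+
+ Each process then accepts its own share of connections without an extra load
+ balancing hop.
+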
+ You may trust the load balancer to handle encryption and to provide security.
+ Alternatively, you may add another layer in front of the load balancer for
+ these purposes.
+
+ There are many possibilities. Don't add layers that you don't need, though.
+
+ How do I implement a health check?
+ ..................................
+
+ Load balancers need a way to check whether server processes are up and running
+ to avoid routing connections to a non-functional backend.
+
+ websockets provides minimal support for responding to HTTP requests with the
+ ``process_request`` hook.
+
+ Here's an example:
+
+ .. literalinclude:: ../../example/faq/health_check_server.py
+     :emphasize-lines: 7-9,16
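+
+ For reference, such a hook looks roughly like the following sketch (the
+ ``/healthz`` path and the port are arbitrary choices; the bundled example may
+ differ):
+
+ .. code-block:: python
+
+     import asyncio
+     from http import HTTPStatus
+
+     from websockets.asyncio.server import serve
+
+
+     def health_check(connection, request):
+         # Answer HTTP health checks without performing the WebSocket handshake.
+         if request.path == "/healthz":
+             return connection.respond(HTTPStatus.OK, "OK\n")
+
+
+     async def handler(connection):
+         ...
+
+
+     async def main():
+         async with serve(
+             handler, "", 8080, process_request=health_check
+         ) as server:
+             await server.serve_forever()
+
+
+     asyncio.run(main())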