Commit 1aa9e38 (parent ca14b38)
Author: Andreas Hellander
Committed: Feb 1, 2024

    Move quickstart to the mnist-pytorch example, to minimize risk for inconsistencies

Showing 2 changed files with 105 additions and 128 deletions.
MNIST (PyTorch version)
=======================

This classic example of handwritten digit recognition is well suited as a lightweight test when developing on FEDn in pseudo-distributed mode. A normal high-end laptop or a workstation should be able to sustain a few clients. The example automates the partitioning of data and the deployment of a variable number of clients on a single host. We here assume working experience with containers, Docker and docker-compose.
Prerequisites
-------------

- `Python 3.8, 3.9 or 3.10 <https://www.python.org/downloads>`__
- `Docker <https://docs.docker.com/get-docker>`__
- `Docker Compose <https://docs.docker.com/compose/install>`__
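If you want to confirm that the prerequisites are met before continuing, a quick check along these lines should work (assuming ``python3``, ``docker`` and ``docker-compose`` are on your PATH; the exact command names may differ on your system):

.. code-block::

   # Print tool versions; Python should report 3.8, 3.9 or 3.10
   python3 --version
   docker --version
   docker-compose --version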
Quick start
-----------

Clone this repository, locate into it and start a pseudo-distributed FEDn network using docker-compose:

.. code-block::

   git clone https://github.com/scaleoutsystems/fedn.git
   cd fedn
   docker-compose up

This starts up the needed backend services (MongoDB and Minio), the API Server and one Combiner.
You can verify the deployment using these URLs:

- API Server: http://localhost:8092
- Minio: http://localhost:9000
- Mongo Express: http://localhost:8081
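You can also check the service status from the command line. This is a minimal check, assuming the stack was started with docker-compose from the repository root as above:

.. code-block::

   # List the compose services and their state (all should be "Up")
   docker-compose ps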
Next, we will prepare the client. A key concept in FEDn is the compute package -
a code bundle that contains entrypoints for training and (optionally) validating a model update on the client.

Locate into 'examples/mnist-pytorch' and familiarize yourself with the project structure. The entrypoints
are defined in 'client/entrypoint'. The dependencies needed in the client environment are specified in
'requirements.txt'. For convenience, we have provided utility scripts to set up a virtual environment.
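For orientation, you can list the files referenced above from within the example directory (paths as described in this README; the exact contents may vary between FEDn versions):

.. code-block::

   cd examples/mnist-pytorch
   # Utility scripts, client entrypoint and client dependencies
   ls bin/ client/ requirements.txt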
Start by initializing a virtual environment with all of the required dependencies for this project:

.. code-block::

   bin/init_venv.sh

Next, create the compute package and a seed model:

.. code-block::

   bin/build.sh

You should now have the files 'package.tgz' and 'seed.npz' in the project folder.
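To sanity-check the build output you can list the artifacts and peek inside the compute package, which is a gzipped tar archive (a sketch, assuming the files ended up in the project folder as stated above):

.. code-block::

   ls -lh package.tgz seed.npz
   # List the files bundled into the compute package
   tar -tzf package.tgz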
Next, we prepare the local dataset. For this we download the MNIST data and create data partitions.

Download the data:

.. code-block::

   bin/get_data

Split the data into 2 partitions (run with ``--n_splits=N`` to create N partitions):

.. code-block::

   bin/split_data

Data partitions will be generated in the folder 'data/clients'.
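For example, if you plan to attach more clients later (such as the 4-client automated setup at the end of this README), you can create four partitions and inspect the result using the ``--n_splits`` flag mentioned above:

.. code-block::

   bin/split_data --n_splits=4
   # One subfolder per client partition
   ls data/clients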
FEDn relies on a configuration file for the client to connect to the server. Create a file called 'client.yaml' with the following content:

.. code-block::

   network_id: fedn-network
   discover_host: api-server
   discover_port: 8092
Now we are ready to connect the clients. First, start a client using the data partition 'data/clients/1/mnist.pt':

.. code-block::

   docker run \
     -v $PWD/client.yaml:/app/client.yaml \
     -v $PWD/data/clients/1:/var/data \
     -e ENTRYPOINT_OPTS=--data_path=/var/data/mnist.pt \
     --network=fedn_default \
     ghcr.io/scaleoutsystems/fedn/fedn:master-mnist-pytorch run client -in client.yaml --name client1

Observe the API Server and Combiner logs; you should see the client connect and enter a state where it asks for the compute package.
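To follow those logs while the client runs, something like the following should work, assuming the compose services are named ``api-server`` and ``combiner`` (as suggested by the ``discover_host`` value above):

.. code-block::

   # Stream the API Server and Combiner logs from the compose stack
   docker-compose logs -f api-server combiner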
In a separate terminal, start a second client using the data partition 'data/clients/2/mnist.pt':

.. code-block::

   docker run \
     -v $PWD/client.yaml:/app/client.yaml \
     -v $PWD/data/clients/2:/var/data \
     -e ENTRYPOINT_OPTS=--data_path=/var/data/mnist.pt \
     --network=fedn_default \
     ghcr.io/scaleoutsystems/fedn/fedn:master-mnist-pytorch run client -in client.yaml --name client2

You are now ready to use the API to initialize the system with the compute package and seed model, and to start federated training:

- Follow the example in the `Jupyter Notebook <https://github.com/scaleoutsystems/fedn/blob/master/examples/mnist-pytorch/API_Example.ipynb>`__
Automate experimentation with several clients
---------------------------------------------

Now that you have an understanding of the main components of FEDn, you can use the provided docker-compose templates to automate the deployment of FEDn and clients. To start the network and attach 4 clients (run with ``--scale client=N`` to attach N clients):

.. code-block::

   docker-compose -f ../../docker-compose.yaml -f docker-compose.override.yaml up --scale client=4

Clean up
--------

You can clean up by running ``docker-compose down``.
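If you also want a more thorough (and destructive) clean-up that removes any named volumes created by the compose stack as well as the locally generated data and build artifacts, something along these lines may be used; the exact paths to remove are an assumption, so adjust them to what you actually generated:

.. code-block::

   # Stop the stack and remove its named volumes
   docker-compose down -v
   # Remove downloaded data, partitions and build artifacts
   rm -rf data package.tgz seed.npz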