Make choice of aggregator dynamic at session level and additional aggregators (FedOpt) #498

Merged
merged 36 commits into from
Jan 25, 2024
Changes from 29 commits
357198a
Removed unused base class
Dec 14, 2023
f59e0bf
work in progress, fedadam
Dec 28, 2023
fa791cf
Fix race condition in docker-compose template
Dec 28, 2023
7a20a6b
Merge branch 'bugfix/GH-496' into feature/fedopt
Dec 28, 2023
9aaf22e
Working fedopt, sgd as server side optimizer
Dec 28, 2023
8cceef3
Working fedopt, sgd as server side optimizer
Dec 30, 2023
353dcf4
Simple notebook demonstrating use of API to run an experiment and to …
Jan 2, 2024
e01eb8e
Make it possible to configure the aggregator per session
Jan 2, 2024
be80527
code checks
Jan 2, 2024
846fafa
Set initial model_id in session config
Jan 11, 2024
be5051b
Merge branch 'master' into feature/fedopt
Jan 11, 2024
e8238b0
fedadam working for pytorch
Jan 19, 2024
94f23c4
Rename numpyarrayhelper to numpyhelper
Jan 21, 2024
a88e557
Updated helper interface with numerics primitives
Jan 22, 2024
9e5ad43
PyTorch models now use list of numpy ndarray as format
Jan 23, 2024
5868624
kerashelper and pytorchhelper consolidated into one numpyhelper
Jan 23, 2024
d214884
Cleaned a bit in examples and added documentation
Jan 23, 2024
6583fc6
removed non working healthcheck
Jan 23, 2024
dc9502b
codechecks
Jan 23, 2024
3df3036
add back inference entrypoint
Jan 23, 2024
21f8065
Update integration tests
Jan 23, 2024
6e0611b
codechecks
Jan 23, 2024
d16f671
Fix imports
Jan 23, 2024
33742e4
Removed unused arguments to combine_models
Jan 23, 2024
29212a6
Refactor helper module and update unit tests
Jan 23, 2024
6a8396d
Refactor helper module
Jan 24, 2024
db1eb74
codecheck
Jan 24, 2024
f000351
Improve aggregator interface
Jan 24, 2024
d6049ea
codecheck
Jan 24, 2024
a35d1fe
Removed addition to fedn.yaml
Jan 24, 2024
847d6ba
Updated docstrings
Jan 24, 2024
bd3c411
Changed RoundControl to RoundHandler to avoid confusion with the glob…
Jan 24, 2024
5a80fcf
Clean up notebook
Jan 24, 2024
f70a7d5
Moved notebook into pytorch example folder
Jan 24, 2024
b1b57ae
Added notebooks in torch example folder to gitignore
Jan 24, 2024
109b62e
Updated docstrings
Jan 25, 2024
4 changes: 2 additions & 2 deletions .github/workflows/integration-tests.yaml
@@ -15,8 +15,8 @@ jobs:
strategy:
matrix:
to_test:
- "mnist-keras kerashelper"
- "mnist-pytorch pytorchhelper"
- "mnist-keras numpyhelper"
- "mnist-pytorch numpyhelper"
python_version: ["3.8", "3.9","3.10"]
os:
- ubuntu-20.04
5 changes: 5 additions & 0 deletions docker-compose.yaml
@@ -119,6 +119,8 @@ services:
- "/venv/bin/pip install --no-cache-dir -e /app/fedn && /venv/bin/fedn run combiner --init config/settings-combiner.yaml"
ports:
- 12080:12080
depends_on:
- api-server

# Client
client:
@@ -136,3 +138,6 @@ services:
- "/venv/bin/pip install --no-cache-dir -e /app/fedn && /venv/bin/fedn run client --init config/settings-client.yaml"
deploy:
replicas: 0
depends_on:
- api-server
- combiner
4 changes: 2 additions & 2 deletions docs/fedn.utils.rst
@@ -33,10 +33,10 @@ fedn.utils.dispatcher module
:undoc-members:
:show-inheritance:

fedn.utils.helpers module
fedn.utils.helpers.helpers module
-------------------------

.. automodule:: fedn.utils.helpers
.. automodule:: fedn.utils.helpers.helpers
:members:
:undoc-members:
:show-inheritance:
4 changes: 2 additions & 2 deletions docs/tutorial.rst
@@ -73,7 +73,7 @@ A *entrypoint.py* example can look like this:
import fire
import torch

from fedn.utils.helpers import get_helper, save_metadata, save_metrics
from fedn.utils.helpers.helpers import get_helper, save_metadata, save_metrics

HELPER_MODULE = 'pytorchhelper'
NUM_CLASSES = 10
@@ -298,7 +298,7 @@ For validations it is a requirement that the output is saved in a valid json for

python entrypoint.py validate in_model_path out_json_path <extra-args>

In the code example we use the helper function :py:meth:`fedn.utils.helpers.save_metrics` to save the validation scores as a json file.
In the code example we use the helper function :py:meth:`fedn.utils.helpers.helpers.save_metrics` to save the validation scores as a json file.

The Dashboard in the FEDn UI will plot any scalar metric in this json file, but you can include any type in the file assuming that it is valid json. These values can then be obtained (by an authorized user) from the MongoDB database or using the :py:mod:`fedn.network.api.client`.

81 changes: 62 additions & 19 deletions examples/mnist-keras/client/entrypoint
@@ -7,9 +7,11 @@ import fire
import numpy as np
import tensorflow as tf

from fedn.utils.helpers import get_helper, save_metadata, save_metrics
from fedn.utils.helpers.helpers import get_helper, save_metadata, save_metrics

HELPER_MODULE = 'numpyhelper'
helper = get_helper(HELPER_MODULE)

HELPER_MODULE = 'kerashelper'
NUM_CLASSES = 10


@@ -22,7 +24,16 @@ def _get_data_path():
return f"/var/data/clients/{number}/mnist.npz"


def _compile_model(img_rows=28, img_cols=28):
def compile_model(img_rows=28, img_cols=28):
""" Compile the TF model.

:param img_rows: The number of rows in the image.
:type img_rows: int
:param img_cols: The number of columns in the image.
:type img_cols: int
:return: The compiled model.
:rtype: keras.model.Sequential
"""
# Set input shape
input_shape = (img_rows, img_cols, 1)

@@ -36,10 +47,11 @@ def _compile_model(img_rows=28, img_cols=28):
model.compile(loss=tf.keras.losses.categorical_crossentropy,
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])

return model


def _load_data(data_path, is_train=True):
def load_data(data_path, is_train=True):
# Load data
if data_path is None:
data = np.load(_get_data_path())
@@ -63,46 +75,77 @@ def _load_data(data_path, is_train=True):


def init_seed(out_path='seed.npz'):
weights = _compile_model().get_weights()
helper = get_helper(HELPER_MODULE)
""" Initialize seed model and save it to file.

:param out_path: The path to save the seed model to.
:type out_path: str
"""
weights = compile_model().get_weights()
helper.save(weights, out_path)


def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1):
""" Complete a model update.

Load model parameters from in_model_path (managed by the FEDn client),
perform a model update, and write the updated parameters
to out_model_path (picked up by the FEDn client).

:param in_model_path: The path to the input model.
:type in_model_path: str
:param out_model_path: The path to save the output model to.
:type out_model_path: str
:param data_path: The path to the data file.
:type data_path: str
:param batch_size: The batch size to use.
:type batch_size: int
:param epochs: The number of epochs to train.
:type epochs: int
"""
# Load data
x_train, y_train = _load_data(data_path)
x_train, y_train = load_data(data_path)

# Load model
model = _compile_model()
helper = get_helper(HELPER_MODULE)
model = compile_model()
weights = helper.load(in_model_path)
model.set_weights(weights)

# Train
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs)

# Save
weights = model.get_weights()
helper.save(weights, out_model_path)

# Metadata needed for aggregation server side
metadata = {
# num_examples is mandatory
'num_examples': len(x_train),
'batch_size': batch_size,
'epochs': epochs,
}

# Save JSON metadata file
# Save JSON metadata file (mandatory)
save_metadata(metadata, out_model_path)

# Save model update (mandatory)
weights = model.get_weights()
helper.save(weights, out_model_path)


def validate(in_model_path, out_json_path, data_path=None):
""" Validate model.

:param in_model_path: The path to the input model.
:type in_model_path: str
:param out_json_path: The path to save the output JSON to.
:type out_json_path: str
:param data_path: The path to the data file.
:type data_path: str
"""

# Load data
x_train, y_train = _load_data(data_path)
x_test, y_test = _load_data(data_path, is_train=False)
x_train, y_train = load_data(data_path)
x_test, y_test = load_data(data_path, is_train=False)

# Load model
model = _compile_model()
model = compile_model()
helper = get_helper(HELPER_MODULE)
weights = helper.load(in_model_path)
model.set_weights(weights)
@@ -127,10 +170,10 @@ def validate(in_model_path, out_json_path, data_path=None):

def infer(in_model_path, out_json_path, data_path=None):
# Using test data for inference but another dataset could be loaded
x_test, _ = _load_data(data_path, is_train=False)
x_test, _ = load_data(data_path, is_train=False)

# Load model
model = _compile_model()
model = compile_model()
helper = get_helper(HELPER_MODULE)
weights = helper.load(in_model_path)
model.set_weights(weights)
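
The `train` function above marks `num_examples` as mandatory metadata "needed for aggregation server side". One plausible reading, which is an assumption here and not something this diff confirms, is that the aggregator weights each client update by its sample count, FedAvg style. The sketch below illustrates that idea with plain Python lists standing in for the list of numpy arrays; `weighted_average` and the shape of `updates` are hypothetical and are not part of FEDn.

```python
def weighted_average(updates):
    """Average client updates, weighting each by its number of examples.

    `updates` is a list of (parameters, num_examples) pairs, where
    `parameters` is a list of layers and each layer is a flat list of
    floats. This layout is a stand-in for a list of numpy arrays.
    """
    total = sum(num_examples for _, num_examples in updates)
    if total <= 0:
        raise ValueError("total num_examples must be positive")

    # Start from zeros with the same layout as the first update.
    first_params, _ = updates[0]
    averaged = [[0.0] * len(layer) for layer in first_params]

    for params, num_examples in updates:
        weight = num_examples / total
        for layer_idx, layer in enumerate(params):
            for i, value in enumerate(layer):
                averaged[layer_idx][i] += weight * value
    return averaged


# Hypothetical usage: two clients, one layer with two parameters each.
client_updates = [
    ([[1.0, 2.0]], 1),  # client A trained on 1 example
    ([[3.0, 4.0]], 3),  # client B trained on 3 examples
]
result = weighted_average(client_updates)
```

Without a sample count per update, an aggregator of this kind could only fall back to an unweighted mean, which may be why the metadata file is required alongside the model update.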
78 changes: 39 additions & 39 deletions examples/mnist-pytorch/client/entrypoint
@@ -7,9 +7,11 @@ import docker
import fire
import torch

from fedn.utils.helpers import get_helper, save_metadata, save_metrics
from fedn.utils.helpers.helpers import get_helper, save_metadata, save_metrics

HELPER_MODULE = 'numpyhelper'
helper = get_helper(HELPER_MODULE)

HELPER_MODULE = 'pytorchhelper'
NUM_CLASSES = 10


@@ -24,7 +26,7 @@ def _get_data_path():
return f"/var/data/clients/{number}/mnist.pt"


def _compile_model():
def compile_model():
""" Compile the pytorch model.

:return: The compiled model.
@@ -44,12 +46,11 @@ def _compile_model():
x = torch.nn.functional.log_softmax(self.fc3(x), dim=1)
return x

# Return model
return Net()


def _load_data(data_path, is_train=True):
""" Load data from disk.
def load_data(data_path, is_train=True):
""" Load data from disk.

:param data_path: Path to data file.
:type data_path: str
@@ -76,54 +77,52 @@ def _load_data(data_path, is_train=True):
return X, y


def _save_model(model, out_path):
""" Save model to disk.
def save_parameters(model, out_path):
""" Save model parameters to file.

:param model: The model to save.
:param model: The model to serialize.
:type model: torch.nn.Module
:param out_path: The path to save to.
:type out_path: str
"""
weights = model.state_dict()
weights_np = collections.OrderedDict()
for w in weights:
weights_np[w] = weights[w].cpu().detach().numpy()
helper = get_helper(HELPER_MODULE)
helper.save(weights, out_path)
parameters_np = [val.cpu().numpy() for _, val in model.state_dict().items()]
helper.save(parameters_np, out_path)


def _load_model(model_path):
""" Load model from disk.
def load_parameters(model_path):
""" Load model parameters from file and populate model.

:param model_path: The path to load from.
:type model_path: str
:return: The loaded model.
:rtype: torch.nn.Module
"""
helper = get_helper(HELPER_MODULE)
weights_np = helper.load(model_path)
weights = collections.OrderedDict()
for w in weights_np:
weights[w] = torch.tensor(weights_np[w])
model = _compile_model()
model.load_state_dict(weights)
model.eval()
model = compile_model()
parameters_np = helper.load(model_path)

params_dict = zip(model.state_dict().keys(), parameters_np)
state_dict = collections.OrderedDict({key: torch.tensor(x) for key, x in params_dict})
model.load_state_dict(state_dict, strict=True)
return model


def init_seed(out_path='seed.npz'):
""" Initialize seed model.
""" Initialize seed model and save it to file.

:param out_path: The path to save the seed model to.
:type out_path: str
"""
# Init and save
model = _compile_model()
_save_model(model, out_path)
model = compile_model()
save_parameters(model, out_path)


def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1, lr=0.01):
""" Train model.
""" Complete a model update.

Load model parameters from in_model_path (managed by the FEDn client),
perform a model update, and write the updated parameters
to out_model_path (picked up by the FEDn client).

:param in_model_path: The path to the input model.
:type in_model_path: str
@@ -139,10 +138,10 @@ def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1
:type lr: float
"""
# Load data
x_train, y_train = _load_data(data_path)
x_train, y_train = load_data(data_path)

# Load model
model = _load_model(in_model_path)
# Load parameters and initialize model
model = load_parameters(in_model_path)

# Train
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
@@ -166,17 +165,18 @@ def train(in_model_path, out_model_path, data_path=None, batch_size=32, epochs=1

# Metadata needed for aggregation server side
metadata = {
# num_examples is mandatory
'num_examples': len(x_train),
'batch_size': batch_size,
'epochs': epochs,
'lr': lr
}

# Save JSON metadata file
# Save JSON metadata file (mandatory)
save_metadata(metadata, out_model_path)

# Save model update
_save_model(model, out_model_path)
# Save model update (mandatory)
save_parameters(model, out_model_path)


def validate(in_model_path, out_json_path, data_path=None):
@@ -190,11 +190,12 @@ def validate(in_model_path, out_json_path, data_path=None):
:type data_path: str
"""
# Load data
x_train, y_train = _load_data(data_path)
x_test, y_test = _load_data(data_path, is_train=False)
x_train, y_train = load_data(data_path)
x_test, y_test = load_data(data_path, is_train=False)

# Load model
model = _load_model(in_model_path)
model = load_parameters(in_model_path)
model.eval()

# Evaluate
criterion = torch.nn.NLLLoss()
@@ -225,5 +226,4 @@ if __name__ == '__main__':
'init_seed': init_seed,
'train': train,
'validate': validate,
# '_get_data_path': _get_data_path, # for testing
})
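
The new `save_parameters` / `load_parameters` pair in this file drops the parameter names on save (only a list of arrays is written) and restores them on load by zipping the list with `model.state_dict().keys()`. The sketch below shows that positional round trip using NumPy only, so it runs without PyTorch. The function names, the `arr_<i>` archive keys (NumPy's default for positional arguments to `np.savez`), and the explicit length check are assumptions for illustration; this is not the `numpyhelper` implementation.

```python
import collections
import io

import numpy as np


def to_parameter_list(state):
    """Flatten an ordered name -> array mapping into a positional list."""
    return [np.asarray(value) for value in state.values()]


def save_parameter_list(parameters, fh):
    """Write arrays to an .npz archive; np.savez names them arr_0, arr_1, ..."""
    np.savez(fh, *parameters)


def load_parameter_list(fh):
    """Read the arrays back in the order they were written."""
    with np.load(fh) as npz:
        return [npz[f"arr_{i}"] for i in range(len(npz.files))]


def from_parameter_list(keys, parameters):
    """Re-attach names by position; the key order must match the save order."""
    keys = list(keys)
    if len(keys) != len(parameters):
        raise ValueError("number of keys and parameters differ")
    return collections.OrderedDict(zip(keys, parameters))


# Hypothetical state dict standing in for model.state_dict().
state = collections.OrderedDict([
    ("fc1.weight", np.ones((2, 3), dtype=np.float32)),
    ("fc1.bias", np.zeros(2, dtype=np.float32)),
])

buffer = io.BytesIO()
save_parameter_list(to_parameter_list(state), buffer)
buffer.seek(0)
restored = from_parameter_list(state.keys(), load_parameter_list(buffer))
```

Because names are recovered purely by position, both sides have to build the model with the same architecture and in the same order, which is presumably why `load_parameters` calls `compile_model()` before loading and uses `strict=True`.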
3 changes: 2 additions & 1 deletion examples/mnist-pytorch/client/fedn.yaml
@@ -2,4 +2,5 @@ entry_points:
train:
command: /venv/bin/python entrypoint train $ENTRYPOINT_OPTS
validate:
command: /venv/bin/python entrypoint validate $ENTRYPOINT_OPTS
command: /venv/bin/python entrypoint validate $ENTRYPOINT_OPTS
Member

what is this?

Member Author

Ah, it snuck in. I was thinking about a new way to specify the helper. Will remove it for now.

helper: pytorchhelper