
Feature/add fedavg metric optrimization controller #3506


Open · wants to merge 3 commits into base: main

Conversation


@rbagan rbagan commented May 22, 2025

Fixes # .

Description

Hi all,

This pull request adds a new controller: FedAvg Metric Optimization. This controller is the result of a joint effort between Roche and Universitätsspital Zurich (USZ) to train a model in a federated manner.

Purpose of the controller:
The goal is to obtain the best possible model by optimizing a specific metric (e.g., minimizing the loss or maximizing the F-score) and to stop the training if the tracked metric does not improve after a certain number of FL rounds, as defined by the researcher. This approach saves computation time during FL training, especially when the model is large and requires a significant amount of data. This controller was developed as part of a paper that is currently under peer review.

Additionally, we wanted to provide the option to choose whether to optimize the metric during training or validation.
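The patience-based stopping described above can be sketched as follows. This is a minimal, self-contained illustration of the idea, not the actual NVFlare controller code; the class and attribute names are hypothetical.

```python
# Hypothetical sketch of patience-based early stopping on a tracked metric.
# Not the actual PTFedAvgMetricOptimization implementation.
class MetricTracker:
    def __init__(self, patience: int, maximize: bool = True):
        self.patience = patience        # FL rounds to wait without improvement
        self.maximize = maximize        # True for e.g. F-score, False for loss
        self.best = None                # best metric value seen so far
        self.rounds_without_improvement = 0

    def update(self, value: float) -> bool:
        """Record one round's metric; return True if training should stop."""
        improved = (
            self.best is None
            or (self.maximize and value > self.best)
            or (not self.maximize and value < self.best)
        )
        if improved:
            self.best = value
            self.rounds_without_improvement = 0
        else:
            self.rounds_without_improvement += 1
        return self.rounds_without_improvement >= self.patience
```

For example, with `MetricTracker(patience=3, maximize=False)` tracking a loss, the workflow would stop after three consecutive rounds in which the loss fails to drop below the best value seen so far.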

We would like to highlight that this contribution is a joint effort by Roche and the Universitätsspital Zurich (USZ).

Best,
Lydia Anette Schönpflug (USZ) and Ruben Bagan Benavides (Roche)

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Quick tests passed locally by running ./runtest.sh.
  • In-line docstrings updated.
  • Documentation updated.

@chesterxgchen chesterxgchen (Collaborator) left a comment

@rbagan
Lydia Anette Schönpflug (USZ) and Ruben Bagan Benavides (Roche),
thank you so much for this contribution. We always love to see contributions from the community.

There are a couple of issues with this PR:

  1. The files need to be formatted in a certain way to pass the unit tests.
    What I usually do is run

./runtest.sh -f

which fixes most of the formatting for me.

then I run

./runtest.sh -s

to check if anything else needs to be fixed.

Behind the scenes it basically calls black-check, isort-check, etc.

or you can simply run

./runtest.sh

which will run all the unit tests to make sure everything passes (the format is checked first).

  2. The proposed PR is very similar to the fedavg_early_stopping.py controller,
    which stops based on a condition such as "accuracy > 0.8":
    https://github.com/NVIDIA/NVFlare/blob/main/nvflare/app_opt/pt/fedavg_early_stopping.py
    The client script for this controller is in the example: https://github.com/NVIDIA/NVFlare/blob/main/examples/hello-world/hello-fedavg/pt_fedavg_early_stopping_script.py

Please see if your work has additional improvement beyond the fedavg_early_stopping controller.
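The condition-string style of stopping that the reviewer refers to (e.g. "accuracy > 0.8") could be evaluated roughly like the sketch below. This is a hypothetical, minimal evaluator for illustration only; the actual fedavg_early_stopping controller implements its own logic.

```python
import operator

# Map comparison symbols to operator functions; the condition string is
# assumed to have the form "<metric> <op> <threshold>", e.g. "accuracy > 0.8".
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def should_stop(condition: str, metrics: dict) -> bool:
    """Return True if the metrics dict satisfies the stop condition."""
    key, op, threshold = condition.split()
    return OPS[op](metrics[key], float(threshold))
```

For example, `should_stop("accuracy > 0.8", {"accuracy": 0.85})` evaluates the condition against the aggregated metrics for the round.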

- `task_validation_name`: specifies the name of the validation task
- `task_to_optimize`: indicates whether to apply metric optimization to the training or validation task
- `patience`: defines the number of FL rounds to wait without improvement before stopping the training
* Model Selection: As an alternative to using an IntimeModelSelector componenet for model selection, we instead compare the metrics of the models in the workflow to select the best model each round.
Collaborator:

typo: componenet -> component

@@ -0,0 +1,284 @@
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
Collaborator:

Please use 2025 in the license files.

from src.net import Net

from nvflare import FedJob
from fedavg_metric_optimization import PTFedAvgMetricOptimization
Collaborator:

This should be: from nvflare.app_opt.pt.fedavg_metric_optimization import PTFedAvgMetricOptimization

# (optional) set a fixed place so we don't need to download every time
CIFAR10_ROOT = "data/cifar10"
# (optional) We change to use GPU to speed things up.
# if you want to use CPU, change DEVICE="cpu"
Collaborator:

We typically use this so the code automatically works with CPU or GPU:

# If available, we use GPU to speed things up.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

3 participants