FL-Simulator

Pytorch implementations of some general optimization methods in the federated learning community.

Basic Methods

FedAvg: Communication-Efficient Learning of Deep Networks from Decentralized Data

FedProx: Federated Optimization in Heterogeneous Networks

FedAdam: Adaptive Federated Optimization

SCAFFOLD: SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

FedDyn: Federated Learning Based on Dynamic Regularization

FedCM: FedCM: Federated Learning with Client-level Momentum

FedSAM/MoFedSAM: Generalized Federated Learning via Sharpness Aware Minimization

FedGamma: FedGamma: Federated Learning with Global Sharpness-Aware Minimization

FedSpeed: FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

FedSMOO: Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

Usage

Training

FL-Simulator runs on a single CPU/GPU to simulate the federated learning (FL) training process with the PyTorch framework. For example, to train centralized FL with the FedAvg method on ResNet-18 and the CIFAR-10 dataset (100 total clients, 10% active clients per round, heterogeneous Dirichlet-0.6 data split), you can use:

python train.py --non-iid --dataset CIFAR-10 --model ResNet18 --split-rule Dirichlet --split-coef 0.6 --active-ratio 0.1 --total-client 100

Other hyperparameters are described in the train.py file.
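
As another illustration, the same flags can be reused for the 5%-200 setting reported in the experiments below with a more heterogeneous Dirichlet-0.3 split (batch size, local epochs, and the remaining options are configured through the additional hyperparameters in train.py):

python train.py --non-iid --dataset CIFAR-10 --model ResNet18 --split-rule Dirichlet --split-coef 0.3 --active-ratio 0.05 --total-client 200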

How to implement your own method?

FL-Simulator pre-defines a basic Server class and Client class, which execute the vanilla FedAvg algorithm. To define a new method, first create a new server file and implement:

  • process_for_communication( ): how your method pre-processes the variables for communication to each client

  • postprocess( ): how your method post-processes the received variables from each local client

  • global_update( ): how your method processes the update on the global model

Then define a new client file or a new local optimizer to perform the local training for your method. Alternatively, you can define a new server class directly and rebuild the internal operations; a minimal sketch of the three server hooks is shown below.
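
For illustration only, the following is a minimal, stand-alone sketch of these three hooks. It is not the repository's actual API: in practice you would override the corresponding methods of the pre-defined Server class, and the names used here (SketchServer, comm_vecs, client_deltas, global_lr, 'local_delta') are hypothetical placeholders; the aggregation shown is a plain FedAvg-style average.

    import torch

    # Minimal sketch: in FL-Simulator you would subclass the pre-defined Server
    # class instead; the attribute names below are hypothetical placeholders.
    class SketchServer:
        def __init__(self, global_model, global_lr=1.0):
            self.global_model = global_model   # shared global model
            self.global_lr = global_lr         # server-side learning rate
            self.comm_vecs = {}                # variables broadcast to clients
            self.client_deltas = {}            # variables received from clients

        def process_for_communication(self, client_id):
            # Pre-process the variables sent to the selected client,
            # e.g. a detached copy of the current global parameters.
            self.comm_vecs['params'] = [p.detach().clone()
                                        for p in self.global_model.parameters()]

        def postprocess(self, client_id, received_vecs):
            # Post-process the variables received from a local client,
            # e.g. store its local parameter delta for later aggregation.
            self.client_deltas[client_id] = received_vecs['local_delta']

        def global_update(self):
            # Update the global model, e.g. apply the averaged client delta
            # scaled by the global learning rate (FedAvg-style aggregation).
            with torch.no_grad():
                for p, per_client in zip(self.global_model.parameters(),
                                         zip(*self.client_deltas.values())):
                    p.add_(self.global_lr * torch.stack(per_client).mean(dim=0))
            self.client_deltas.clear()

Method-specific logic (e.g., control variates as in SCAFFOLD or a dynamic regularizer as in FedDyn) would typically extend the variables handled in process_for_communication( ) / postprocess( ) and the local optimizer used by the client.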

Some Experiments

We show some results for the ResNet-18-GN model on the CIFAR-10 dataset; the corresponding hyperparameters are listed in the next subsection. The per-round time costs are measured on an NVIDIA Tesla V100 GPU.

CIFAR-10 (ResNet-18-GN), T = 1000 communication rounds: test accuracy (%) and wall-clock time per round.

Setting 10%-100 (100 clients, 10% active per round, batch size 50, 5 local epochs):

             IID     Dir-0.6   Dir-0.3   Dir-0.1   Time / round
SGD basis
  FedAvg     82.52   80.65     79.75     77.31     15.86s
  FedProx    82.54   81.05     79.52     76.86     19.78s
  FedAdam    84.32   82.56     82.12     77.58     15.91s
  SCAFFOLD   84.88   83.53     82.75     79.92     20.09s
  FedDyn     85.46   84.22     83.22     78.96     20.82s
  FedCM      85.74   83.81     83.44     78.92     20.74s
SAM basis
  FedGamma   85.74   84.80     83.81     80.72     30.13s
  MoFedSAM   87.24   85.74     85.14     81.58     29.06s
  FedSpeed   87.31   86.33     85.39     82.26     29.48s
  FedSMOO    87.70   86.87     86.04     83.30     30.43s

Setting 5%-200 (200 clients, 5% active per round, batch size 25, 5 local epochs):

             IID     Dir-0.6   Dir-0.3   Dir-0.1   Time / round
SGD basis
  FedAvg     81.09   79.93     78.66     75.21     17.03s
  FedProx    81.56   79.49     78.76     75.84     20.97s
  FedAdam    83.29   81.22     80.22     75.83     17.67s
  SCAFFOLD   84.24   83.01     82.04     78.23     22.21s
  FedDyn     81.11   80.25     79.43     75.43     22.68s
  FedCM      83.77   82.01     80.77     75.91     21.24s
SAM basis
  FedGamma   84.99   84.02     83.03     80.09     33.63s
  MoFedSAM   86.27   84.71     83.44     79.02     32.45s
  FedSpeed   86.87   85.07     83.94     79.66     33.69s
  FedSMOO    87.40   85.97     85.14     81.35     34.80s

The blank parts are awaiting updates.

Some key hyperparameter selections

             local Lr   global Lr    Lr decay       SAM Lr   proxy coefficient   client-momentum coefficient
FedAvg       0.1        1.0          0.998          -        -                   -
FedProx      0.1        1.0          0.998          -        0.1 / 0.01          -
FedAdam      0.1        0.1 / 0.05   0.998          -        -                   -
SCAFFOLD     0.1        1.0          0.998          -        -                   -
FedDyn       0.1        1.0          0.9995 / 1.0   -        0.1                 -
FedCM        0.1        1.0          0.998          -        -                   0.1
FedGamma     0.1        1.0          0.998          0.01     -                   -
MoFedSAM     0.1        1.0          0.998          0.1      -                   0.05 / 0.1
FedSpeed     0.1        1.0          0.998          0.1      0.1                 -
FedSMOO      0.1        1.0          0.998          0.1      0.1                 -

The hyperparameter selections above are for reference only; each algorithm has its own properties that favor particular settings. To facilitate a relatively fair comparison, we report a set of selections with which each method performs well in general cases. Please adjust the hyperparameters when switching to different model backbones or datasets.

ToDo

  • Decentralized Implementation
  • Delayed / Asynchronous Implementation
  • Hyperparameter Selections
  • Related Advances (Long-Term)

Citation

If this codebase is helpful to you, please cite our papers:

FedSpeed (ICLR 2023):

@article{sun2023fedspeed,
  title={Fedspeed: Larger local interval, less communication round, and higher generalization accuracy},
  author={Sun, Yan and Shen, Li and Huang, Tiansheng and Ding, Liang and Tao, Dacheng},
  journal={arXiv preprint arXiv:2302.10429},
  year={2023}
}

FedSMOO (ICML 2023 Oral):

@inproceedings{sun2023dynamic,
  title={Dynamic regularized sharpness aware minimization in federated learning: Approaching global consistency and smooth landscape},
  author={Sun, Yan and Shen, Li and Chen, Shixiang and Ding, Liang and Tao, Dacheng},
  booktitle={International Conference on Machine Learning},
  pages={32991--33013},
  year={2023},
  organization={PMLR}
}
