Question about implementation of gradient compression strategies #18447
-
Hi guys, I am currently studying gradient compression for reducing distributed communication. I found that some strategies have already been implemented in MXNet, such as Signum and 2-bit quantization. However, these methods are implemented at different levels: Signum is an optimizer, while 2-bit quantization only works in the kvstore. So I am confused about the best way to implement such gradient compression strategies. How do we decide where (Python frontend, kvstore, or elsewhere) to implement them?
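For context, here is a minimal sketch of how the two existing mechanisms are exposed today, assuming a distributed kvstore launch (e.g. via tools/launch.py); the threshold and hyperparameter values are only illustrative:

```python
import mxnet as mx
from mxnet import gluon

net = gluon.nn.Dense(10)
net.initialize()

# (a) 2-bit quantization is a kvstore feature: gradients are quantized before
#     being pushed to the parameter servers. This assumes the job is started
#     with a distributed kvstore.
kv = mx.kv.create('dist_sync')
kv.set_gradient_compression({'type': '2bit', 'threshold': 0.5})

# With Gluon, the same setting can be passed straight to the Trainer:
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1},
                        kvstore='dist_sync',
                        compression_params={'type': '2bit', 'threshold': 0.5})

# (b) Signum is just an optimizer: the compression (keeping only the sign of
#     the momentum-corrected gradient) happens inside the update rule, so it
#     is selected like any other optimizer and never touches the kvstore.
signum_trainer = gluon.Trainer(net.collect_params(),
                               mx.optimizer.Signum(learning_rate=0.01,
                                                   momentum=0.9))
```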
Replies: 2 comments 1 reply
-
@mxnet-label-bot add [Question]
-
Recently, we provided a gradient compression API based on BytePS. See bytedance/byteps#225 and https://github.com/bytedance/byteps/blob/master/docs/gradient-compression.md. Currently, it supports 1-bit, top-k, random-k, and dithering. An example of training on ImageNet with MXNet is provided in https://github.com/bytedance/byteps/blob/master/example/mxnet/train_gluon_imagenet_byteps_gc.py
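To give a rough idea of the usage: compression is configured through the `DistributedTrainer`'s `compression_params` dict. The keys and values below are only illustrative and follow my reading of the gradient-compression.md doc linked above; please check that doc for the exact names and supported combinations.

```python
import byteps.mxnet as bps
from mxnet import gluon

bps.init()

net = gluon.nn.Dense(10)
net.initialize()

# Illustrative compression settings; key names ('compressor', 'ef',
# 'momentum') are taken from the gradient-compression.md doc and may
# need adjusting to match the released API.
compression_params = {
    'compressor': 'onebit',   # or 'topk', 'randomk', 'dithering'
    'ef': 'vanilla',          # error-feedback variant
    'momentum': 'nesterov',   # compressor-side momentum
}

trainer = bps.DistributedTrainer(net.collect_params(), 'sgd',
                                 {'learning_rate': 0.1},
                                 compression_params=compression_params)
```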