Support for non-i.i.d. data regime in OpenDiLoCo experiments #37

Open
gaseln opened this issue Nov 26, 2024 · 1 comment

gaseln commented Nov 26, 2024

Hello again!

The DiLoCo paper by Arthur Douillard et al. explores the non-i.i.d. data regime in comparison to data parallelism. Could you kindly confirm if OpenDiLoCo supports this setup? If it does, could you please provide guidance on how to configure such an experiment? If not, would you recommend an efficient way to organize it in a similar manner to the approach described in the paper?

Thank you!

@Jackmin801
Member

It seems that in the original paper they built the non-i.i.d. setting by running k-means clustering on last-layer features from a model. I'm not sure whether they disclose which model this is, but I imagine it doesn't matter too much; if I were to guess, they probably used BERT.

You can then have the different workers load different datasets, each one loading a different cluster split from the k-means.
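A rough sketch of that idea, in case it helps. It is not code from OpenDiLoCo or from the paper: the `kmeans` and `shard_by_cluster` helpers are hypothetical names, the features are toy 2-D points standing in for real last-layer embeddings, and the k-means is a plain standard-library Lloyd's loop (in practice a library implementation such as scikit-learn's `KMeans` would be the more natural choice).

```python
import random
from collections import defaultdict


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float vectors.

    Hypothetical helper for illustration. Returns one cluster id per vector.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(c) for c in rng.sample(features, k)]
    assignments = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, x in enumerate(features):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [features[i] for i, a in enumerate(assignments) if a == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


def shard_by_cluster(assignments):
    """Map cluster id -> list of example indices (one shard per worker)."""
    shards = defaultdict(list)
    for idx, cluster in enumerate(assignments):
        shards[cluster].append(idx)
    return dict(shards)


# Toy stand-in for last-layer features: two well-separated 2-D blobs.
# Real features would come from an encoder of your choice (an assumption here).
toy_rng = random.Random(0)
features = [[toy_rng.gauss(0, 0.1), toy_rng.gauss(0, 0.1)] for _ in range(50)]
features += [[toy_rng.gauss(5, 0.1), toy_rng.gauss(5, 0.1)] for _ in range(50)]

num_workers = 2  # one cluster per worker
assignments = kmeans(features, k=num_workers)
shards = shard_by_cluster(assignments)

# Worker with rank r would then train only on the examples in shards[r].
for rank, indices in sorted(shards.items()):
    print(f"worker {rank}: {len(indices)} examples")
```

Each worker would then restrict its dataset to the indices in its own shard. How that hooks into the data loader depends on the training script, so it is left out here.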
