
enable ddp find_unused_parameters #111

Merged
merged 2 commits into from
Nov 13, 2024
Conversation

yurujaja
Collaborator

No description provided.

@yurujaja yurujaja requested a review from VMarsocci October 27, 2024 21:53
@LeungTsang
Collaborator

DDP takes the input, passes it to the local model, and then analyzes the output from the local model when find_unused_parameters is set to True. This mode allows running backward on a subgraph of the model: DDP finds out which parameters are involved in the backward pass by traversing the autograd graph from the model output, and marks all unused parameters as ready for reduction. During the backward pass, the Reducer only waits for unready parameters, but it still reduces all buckets. Marking a parameter gradient as ready does not currently help DDP skip buckets, but it prevents DDP from waiting forever for absent gradients during the backward pass. Note that traversing the autograd graph introduces extra overhead, so applications should only set find_unused_parameters to True when necessary.
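The behavior described above can be seen in a minimal runnable sketch. The model and its names (`TwoHeadNet`, `head_b`) are hypothetical, chosen only to create a parameter that stays out of the backward graph; a single-process "gloo" group is used so it runs on one CPU. Without `find_unused_parameters=True`, the Reducer would wait on gradients for `head_b` that never arrive.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process "gloo" group so the example runs on one CPU.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

class TwoHeadNet(torch.nn.Module):
    """Hypothetical toy model: one head is left out of the forward pass."""
    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Linear(4, 4)
        self.head_a = torch.nn.Linear(4, 2)
        self.head_b = torch.nn.Linear(4, 2)  # never used below

    def forward(self, x):
        # head_b is not called, so its parameters are absent
        # from the autograd graph of the output.
        return self.head_a(self.backbone(x))

# find_unused_parameters=True lets DDP traverse the autograd graph
# from the output and mark head_b's parameters as ready for reduction.
model = DDP(TwoHeadNet(), find_unused_parameters=True)
out = model(torch.randn(8, 4))
out.sum().backward()  # completes despite the unused parameters

dist.destroy_process_group()
```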

Owner

@VMarsocci VMarsocci left a comment


we should enable find_unused_parameters only when finetune is on, thanks
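One way to implement the reviewer's suggestion is to gate the flag on the finetune setting at wrap time. This is a hedged sketch, not the PR's actual code: the helper name `wrap_model` and the `finetune` parameter are hypothetical stand-ins for the project's own config.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_model(model: torch.nn.Module, finetune: bool) -> DDP:
    # find_unused_parameters adds autograd-graph traversal overhead,
    # so enable it only for finetuning, where frozen or bypassed
    # submodules can leave parameters out of the backward pass.
    return DDP(model, find_unused_parameters=finetune)

# Minimal single-process group so the sketch is runnable on one CPU.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
dist.init_process_group("gloo", rank=0, world_size=1)

ddp = wrap_model(torch.nn.Linear(4, 2), finetune=True)
dist.destroy_process_group()
```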

@VMarsocci VMarsocci merged commit 9a795ad into main Nov 13, 2024
1 check passed
@yurujaja yurujaja deleted the fix-ddp branch November 19, 2024 09:12

4 participants