
[Bug] #123

Closed

yomi1117 opened this issue Dec 31, 2024 · 4 comments

@yomi1117
Environment

The environment is set up correctly.

Describe the bug

OOM... I know video distillation needs many GPUs. I used 8 A800s to run the distillation training of Hunyuan, but I got an OOM error. How many GPUs do we need to run the distill...

Reproduction

None

@foreverpiano
Collaborator

8 GPUs are enough. Can you share your script?

@jzhang38
Collaborator

jzhang38 commented Jan 1, 2025

I just updated the script for 8 GPUs. Can you try it here:

# If you do not have 32 GPUs and need to fit in memory, you can: 1. increase sp_size. 2. reduce num_latent_t
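
For illustration only, a single-node 8-GPU launch that applies those two knobs might look like the sketch below. The entry-point path and the exact flag spellings (--sp_size, --num_latent_t) are assumptions inferred from the comment above rather than copied from the repo, and the values are placeholders, so check the updated script for the real ones.

# Hypothetical sketch, not the repo's actual launch script.
# A larger sp_size shards the sequence across more GPUs, and a smaller
# num_latent_t shortens the latent video; both lower per-GPU memory.
torchrun --nnodes 1 --nproc_per_node 8 \
    fastvideo/distill.py \
    --sp_size 8 \
    --num_latent_t 24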

@yomi1117
Author

yomi1117 commented Jan 1, 2025

Thank you for your help. I'm curious how many GPUs and what config are needed to train FastHunyuan. Thank you!

@jzhang38
Collaborator

jzhang38 commented Jan 2, 2025

This should be the exact command to reproduce FastHunyuan:

torchrun --nnodes 4 --nproc_per_node 8 \
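
The remaining arguments were cut off in the quote above. Purely as an illustrative skeleton of such a 4-node x 8-GPU (32-GPU) launch, and not the exact FastHunyuan command, each node would typically also pass rendezvous settings like the ones below; --node_rank, the master address and port, and the script path are placeholders.

# Illustrative multi-node skeleton only. Run on every node, changing
# --node_rank from 0 to 3; MASTER_ADDR is the IP of the rank-0 node.
torchrun --nnodes 4 --nproc_per_node 8 \
    --node_rank 0 \
    --master_addr "$MASTER_ADDR" \
    --master_port 29500 \
    fastvideo/distill.py
    # ...followed by the distillation arguments from the updated script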

@rlsu9 closed this as completed Jan 4, 2025