
LoRA & p-tuning with multi-GPU #22

Open
haozhouamzn opened this issue Oct 19, 2023 · 3 comments

@haozhouamzn

Hi, Table 20 shows prefix FT with 2 and 4 GPUs. How were those results obtained? I tried

MODEL=facebook/opt-13b TASK=SST2 MODE=prefix LR=1e-5 NUM_GPU=8 bash finetune_fsdp.sh

but got the following error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:5! (when checking argument for argument index in method wrapper_CUDA__index_select)
@gaotianyu1350
Member

Hi,

You do not need to use FSDP for MeZO multi-GPU, since MeZO only requires model inference. You should be able to run it directly with mezo.sh (the same command the README gives for a single GPU, with no code or script changes). Just make sure there are 2 available GPUs.
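
For reference, a minimal sketch of what that looks like, assuming mezo.sh accepts the same environment variables as finetune_fsdp.sh above (the LR value here is an illustrative placeholder; take the actual hyperparameters from the README):

# Restrict the run to two GPUs; MeZO only performs inference, so no
# FSDP or launcher changes are needed.
# LR is a placeholder, not a tuned value -- see the README.
CUDA_VISIBLE_DEVICES=0,1 \
MODEL=facebook/opt-13b TASK=SST2 MODE=prefix LR=1e-3 bash mezo.sh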

@haozhouamzn
Author

Thanks, yes, MeZO works out of the box.

How about first-order prefix FT (the Prefix FT column in Table 20)? The results for 13B, 30B, and 66B used FSDP, right?

@gaotianyu1350
Member

gaotianyu1350 commented Oct 25, 2023

Yes, and you should be able to run them via the following command (from the README):

# Full-parameter fine-tuning using fully-sharded data parallel or FSDP (multi-GPU)
MODEL=facebook/opt-13b TASK=SST2 MODE=ft LR=1e-5 NUM_GPU=4 bash finetune_fsdp.sh

You can change MODE to prefix or lora.
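
Concretely, the prefix and LoRA variants of that command would look like the following (same script, only MODE changes; the LR shown is carried over from the ft example and may need retuning for these modes):

# Prefix-tuning with FSDP across 4 GPUs
MODEL=facebook/opt-13b TASK=SST2 MODE=prefix LR=1e-5 NUM_GPU=4 bash finetune_fsdp.sh

# LoRA with FSDP across 4 GPUs
MODEL=facebook/opt-13b TASK=SST2 MODE=lora LR=1e-5 NUM_GPU=4 bash finetune_fsdp.sh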
