
LoRA & p-tuning with multi-GPU #22

Open
haozhouamzn opened this issue Oct 19, 2023 · 3 comments

@haozhouamzn

Hi, Table 20 shows prefix FT with 2 and 4 GPUs. How were those results obtained? I tried

MODEL=facebook/opt-13b TASK=SST2 MODE=prefix LR=1e-5 NUM_GPU=8 bash finetune_fsdp.sh

but got the following error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:5! (when checking argument for argument index in method wrapper_CUDA__index_select)
@gaotianyu1350
Member

Hi,

You do not need to use FSDP for MeZO multi-GPU, since MeZO only requires model inference. You should be able to run it directly with mezo.sh (the same command the README gives for a single GPU, with no code or script changes). Just make sure there are 2 available GPUs.
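
For reference, a minimal sketch of what that looks like, assuming mezo.sh accepts the same environment variables as finetune_fsdp.sh above (the LR value here is an illustrative placeholder; take the actual hyperparameters from the README):

# Restrict the run to two GPUs; MeZO only performs inference, so no
# FSDP or launcher changes are needed.
# LR is a placeholder, not a tuned value -- see the README.
CUDA_VISIBLE_DEVICES=0,1 \
MODEL=facebook/opt-13b TASK=SST2 MODE=prefix LR=1e-3 bash mezo.sh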

@haozhouamzn
Author

Thanks, yes, MeZO works out of the box.

How about first-order prefix FT (the Prefix FT column in Table 20)? The results for 13B, 30B, and 66B used FSDP, right?

@gaotianyu1350
Member

gaotianyu1350 commented Oct 25, 2023

Yes, and you should be able to run them via the following command (from the README):

# Full-parameter fine-tuning using fully-sharded data parallel or FSDP (multi-GPU)
MODEL=facebook/opt-13b TASK=SST2 MODE=ft LR=1e-5 NUM_GPU=4 bash finetune_fsdp.sh

You can change MODE to prefix or lora.
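
Concretely, the prefix and LoRA variants of that command would look like the following (same script, only MODE changes; the LR shown is carried over from the ft example and may need retuning for these modes):

# Prefix-tuning with FSDP across 4 GPUs
MODEL=facebook/opt-13b TASK=SST2 MODE=prefix LR=1e-5 NUM_GPU=4 bash finetune_fsdp.sh

# LoRA with FSDP across 4 GPUs
MODEL=facebook/opt-13b TASK=SST2 MODE=lora LR=1e-5 NUM_GPU=4 bash finetune_fsdp.sh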
