
Generate llama instead of downloading it #250

Merged: 3 commits merged into staging on Sep 5, 2024

Conversation

@satyaog (Member) commented Aug 13, 2024

No description provided.

Review comment on benchmarks/llm/prepare.py (outdated, resolved)
@satyaog force-pushed the feature/generate_llama branch 2 times, most recently from cc8a7c3 to de3ffda on August 14, 2024 21:51
@satyaog (Member, Author) commented Aug 14, 2024

This should work for the 8B and 70B llama3 models. To test it, though, there are two points where I would need tips:

  • Where can I find the resources (8 GPUs) to test the 70B llama3? Do we have a dedicated machine on the Mila cluster?
  • The 70B llama3 benches all use *.safetensors, whereas the generated files are original/*.pth files. Are safetensors needed? How could I change the config to use the original/*.pth instead?

@Delaunay (Collaborator) commented

We could try to change the config to use the original checkpoints:

checkpointer:
  _component_: torchtune.utils.FullModelMetaCheckpointer

instead of

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
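
For context, a fuller sketch of what the two stanzas could look like side by side; the directories, file names, and surrounding keys are assumptions based on the usual llama3 checkpoint layout, not taken from this PR:

    # Meta-format checkpointer: loads the single consolidated original/*.pth file
    checkpointer:
      _component_: torchtune.utils.FullModelMetaCheckpointer
      checkpoint_dir: /path/to/Meta-Llama-3-8B/original  # assumed path
      checkpoint_files:
        - consolidated.00.pth
      model_type: LLAMA3

    # HF-format checkpointer: loads the sharded *.safetensors files
    checkpointer:
      _component_: torchtune.utils.FullModelHFCheckpointer
      checkpoint_dir: /path/to/Meta-Llama-3-70B  # assumed path
      checkpoint_files:
        - model-00001-of-00030.safetensors
        # ... one entry per shard ...
        - model-00030-of-00030.safetensors
      model_type: LLAMA3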

@satyaog force-pushed the feature/generate_llama branch from 42f3aed to 6da9faa on August 20, 2024 13:50
@Delaunay mentioned this pull request on Aug 20, 2024
@Delaunay (Collaborator) commented

Could you check whether we are able to load the original/*.pth on 4 GPUs, or whether it requires 8 GPUs?

@satyaog (Member, Author) commented Aug 20, 2024

Sure, I'll try, but with the safetensors, since torchtune unfortunately does not support multiple *.pth files for now:

Currently we support reading from a single checkpoint file only. Support for reading from
sharded checkpoints is WIP.

https://github.com/pytorch/torchtune/blob/main/torchtune/utils/_checkpointing/_checkpointer.py#L650-L656

But I think it should work, as there's an option to offload to CPU:

fsdp_cpu_offload: True

https://github.com/mila-iqia/milabench/pull/250/files#diff-1072365d60b45fad1a39661bb5e7a99bc355fc167d08476e83f8b97484eecd6aL98
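
For reference, in the torchtune distributed recipes this flag sits at the top level of the recipe config; a minimal sketch, where the surrounding keys are illustrative assumptions:

    # sketch of a distributed fine-tune recipe config (surrounding keys assumed)
    device: cuda
    dtype: bf16
    fsdp_cpu_offload: True  # offload FSDP-sharded parameters to CPU to fit on fewer GPUs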

@satyaog force-pushed the feature/generate_llama branch from 6da9faa to eca81da on August 20, 2024 20:13
@satyaog (Member, Author) commented Aug 20, 2024

Tested with 8 GPUs, and currently testing with 4 GPUs and ~600G of RAM (so far so good, with 18/30 steps completed). Both are working. So llama3 8B will generate/load the original/*.pth format of the model, and the 70B will generate/load the safetensors format.

@satyaog (Member, Author) commented Aug 20, 2024

Would you use the cloud-ci to test this?

@satyaog force-pushed the feature/generate_llama branch from 8b35c3d to 282e9c2 on August 27, 2024 17:55
* rename the Hugging Face token env var to MILABENCH_* so it is automatically forwarded to a remote in such cases
@satyaog force-pushed the feature/generate_llama branch from 282e9c2 to 39222b8 on September 5, 2024 14:30
@Delaunay changed the base branch from master to staging on September 5, 2024 15:28
@Delaunay merged commit ea44ea6 into staging on Sep 5, 2024 (3 of 6 checks passed)
Review thread on the checkpoint file list in the config diff:

    model-00028-of-00030.safetensors,
    model-00029-of-00030.safetensors,
    model-00030-of-00030.safetensors,
    model-00001-of-00062.safetensors,
@Delaunay (Collaborator) commented:

It would be nice if the weights were interchangeable.

@satyaog (Member, Author) replied:

You mean we should generate the .yaml file after generating the .safetensors files?

@Delaunay (Collaborator) replied:

That the generation step produces the 30 safetensors files, so that we can switch between pretrained & generated without any problem.

@satyaog (Member, Author) replied:

Done here: #278
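
For illustration (not the actual #278 implementation), a minimal Python sketch of sharding a checkpoint into a fixed number of safetensors files plus a Hugging Face-style index, so that generated weights could stand in for the pretrained ones; the shard count, paths, and helper name are assumptions:

    # shard_safetensors.py -- hypothetical helper, not from this PR
    import json
    from pathlib import Path

    from safetensors.torch import save_file

    def shard_state_dict(state_dict, out_dir, num_shards=30):
        """Split state_dict into num_shards safetensors files and write an
        HF-style model.safetensors.index.json mapping each tensor to its shard."""
        out = Path(out_dir)
        out.mkdir(parents=True, exist_ok=True)
        keys = list(state_dict)
        per_shard = -(-len(keys) // num_shards)  # ceiling division
        weight_map = {}
        for i in range(num_shards):
            shard_keys = keys[i * per_shard : (i + 1) * per_shard]
            if not shard_keys:
                break
            fname = f"model-{i + 1:05d}-of-{num_shards:05d}.safetensors"
            # save_file requires contiguous, non-shared tensors
            save_file({k: state_dict[k].contiguous() for k in shard_keys},
                      str(out / fname))
            weight_map.update({k: fname for k in shard_keys})
        # The index file tells HF-style loaders which shard holds each tensor.
        index = {"metadata": {}, "weight_map": weight_map}
        (out / "model.safetensors.index.json").write_text(json.dumps(index, indent=2))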
