-
Hey, I am working on an LLM and recently implemented checkpointing in my models. Because of this, I have files that are too large to push (small for now, but already 100-400 MB). I am aware of Git LFS, but what other solutions exist, and are any considered better? Especially given that I am in the training phase and will be constantly updating the .pth files. Many thanks!
-
That's great, mate!
-
Cloud Storage + Links: You could store your .pth files on a cloud storage service like Google Drive, AWS S3, or Azure, and then add links to them in your repo (see the upload sketch below).
DVC (Data Version Control): If you're looking for something more integrated, DVC is a great option. It's designed for handling bigger datasets and model files while keeping version control in sync with Git. It can store your checkpoints remotely (cloud, S3, etc.) and manage them through Git without committing the actual files (rough workflow below).
Since you're in the training phase, DVC might be ideal: it tracks different versions of your files and works well with Git. Either that or stick to LFS (setup below).
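
For the cloud-storage route, here is a minimal sketch using the AWS CLI. The bucket name and file paths are hypothetical placeholders; only a link or pointer to the object would go in the repo, never the file itself:

```bash
# Upload a checkpoint to S3 (bucket and paths are made-up examples)
aws s3 cp checkpoints/model_epoch10.pth s3://my-training-bucket/checkpoints/model_epoch10.pth

# Later, fetch it back down on another machine
aws s3 cp s3://my-training-bucket/checkpoints/model_epoch10.pth checkpoints/model_epoch10.pth
```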
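And a rough DVC workflow, assuming an S3 remote (the remote name, bucket, and checkpoint path below are placeholders, not anything from your repo):

```bash
# One-time setup inside an existing Git repo
dvc init
dvc remote add -d storage s3://my-training-bucket/dvc-store

# Track a checkpoint: DVC writes a small .dvc pointer file
# and gitignores the real .pth so Git never sees the big file
dvc add checkpoints/model.pth
git add checkpoints/model.pth.dvc checkpoints/.gitignore
git commit -m "Track checkpoint with DVC"

# Upload the actual .pth to the remote; `dvc pull` fetches it elsewhere
dvc push
```

Each time you overwrite model.pth during training, re-running `dvc add` + `git commit` + `dvc push` gives you a new versioned snapshot while your Git history stays tiny.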
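If you do stick with LFS, the setup is just a tracking rule; keep in mind that every updated .pth version still counts against your LFS storage and bandwidth quota:

```bash
git lfs install                 # one-time, per machine
git lfs track "*.pth"           # records the pattern in .gitattributes
git add .gitattributes
git add checkpoints/model.pth   # stored in Git history as an LFS pointer
git commit -m "Track .pth checkpoints with LFS"
git push
```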