diff --git a/training/training.md b/training/training.md
index 2919cef..fde1551 100644
--- a/training/training.md
+++ b/training/training.md
@@ -1,3 +1,5 @@
+# Training
+
 We trained our Bamba model with FSDP using our training repo [here](https://github.com/foundation-model-stack/fms-fsdp/tree/mamba-new). Note that this training effort was started before FSDP2 and also long before we contributed `Mamba2-Hybrid` to HF, so we were doing FSDP1 training with [official Mamba implementation](https://github.com/state-spaces/mamba).