This repo builds on the code and models of Audio MAE.
- This repo follows the MAE repo; installation and preparation follow that repo.
- Copy files and patch the timm package with `bash timm_patch.sh` (please change the path to your own timm package path).
- Please find mae_env.yml for all the dependencies.
- You may also download the conda-packed conda env, untar it, and then:
conda env create -f mae_env.yaml
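The timm patch step above asks for the path to your own timm installation. One way to look it up from inside the activated environment is sketched below. The helper name `package_dir` is hypothetical and not part of this repo; it only relies on the standard library's `importlib.util.find_spec`.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the install directory of a top-level package, or None.

    Hypothetical helper for illustration; it is not part of this repo.
    """
    spec = importlib.util.find_spec(name)
    # Plain modules and missing packages have no package directory.
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


if __name__ == "__main__":
    # Prints the timm directory if timm is installed in this environment,
    # otherwise prints None.
    print(package_dir("timm"))
```

The printed directory is presumably the value to substitute for the timm path before running the patch script.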
Please download AudioSet here. Due to copyright, we cannot release the data. The parsed data annotation JSON used in this work is available here. Its format follows the one used in AST. Be sure to modify the paths in the scripts to reflect your own setup.
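If the audio paths stored in the annotation JSON also need to point at your local copy of AudioSet, a small script can rewrite them. The sketch below assumes the AST-style layout, a top-level `"data"` list whose entries carry a `"wav"` path and a `"labels"` string; check this against the actual file before using it. The function names and the example paths are hypothetical.

```python
import json
from pathlib import Path


def rewrite_wav_prefix(manifest: dict, old_prefix: str, new_prefix: str) -> dict:
    """Return a copy of the manifest with old_prefix replaced in each "wav" path.

    Assumes the AST-style layout {"data": [{"wav": ..., "labels": ...}, ...]}.
    Entries whose path does not start with old_prefix are left unchanged.
    """
    entries = []
    for entry in manifest["data"]:
        wav = entry["wav"]
        if wav.startswith(old_prefix):
            wav = new_prefix + wav[len(old_prefix):]
        entries.append({**entry, "wav": wav})
    return {**manifest, "data": entries}


def rewrite_file(src: str, dst: str, old_prefix: str, new_prefix: str) -> None:
    """Read an annotation JSON, rewrite its wav paths, and write the result."""
    manifest = json.loads(Path(src).read_text())
    updated = rewrite_wav_prefix(manifest, old_prefix, new_prefix)
    Path(dst).write_text(json.dumps(updated, indent=1))
```

For example, `rewrite_file("train.json", "train_local.json", "/old/root/", "/my/audioset/")` would write a new file and leave the original untouched; both file names and both prefixes are placeholders.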
For the brave ones who want to pre-train on AudioSet-2M, please use pretrain_audioset2M.sh:
bash pretrain_audioset2M.sh
bash ft_srun_pathway.sh