Add zipformer recipe for audio tagging #1421

marcoyang1998 · 2023-12-19T09:24:21Z

This PR adds a recipe for audio tagging using Zipformer. We use AudioSet as the training set.

Without knowledge distillation, the Zipformer-M model achieves an mAP of 45.1. Knowledge distillation could further improve the results (support for it will be added in a separate PR).
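For readers unfamiliar with the metric, here is a minimal sketch of how mean average precision (mAP) can be computed for a multi-label tagging task. It is not the recipe's evaluation code: the function names and the toy data are hypothetical, ties between scores are not handled specially, and a class without positive clips is simply scored as 0.0. The reported 45.1 is presumably this quantity expressed as a percentage.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], targets: Sequence[int]) -> float:
    """Average precision for one class: mean of the precision at each positive hit."""
    # Rank clips from the highest to the lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if targets[idx] == 1:
            hits += 1
            precision_sum += hits / rank
    # Simplifying assumption: a class with no positive clips contributes 0.0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    scores: List[List[float]], targets: List[List[int]]
) -> float:
    """scores and targets hold one row per clip and one column per class."""
    num_classes = len(scores[0])
    aps = [
        average_precision([row[c] for row in scores], [row[c] for row in targets])
        for c in range(num_classes)
    ]
    return sum(aps) / num_classes


# Hypothetical toy data: 3 clips, 2 classes.
toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]
toy_targets = [[1, 0], [0, 1], [1, 1]]
toy_map = mean_average_precision(toy_scores, toy_targets)
```

In this toy case the first class has an average precision of 5/6 and the second of 1, so `toy_map` is 11/12.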

Add usage example
Upload the pre-trained model

marcoyang1998 · 2024-03-20T10:43:53Z

You can use the pretrained model to predict the audio events in an audio clip with the following command:

python zipformer/pretrained.py \
    --checkpoint zipformer/exp_at_as_full/pretrained.pt \
    --label-dict downloads/audioset/class_labels_indices.csv \
    downloads/audioset/eval/wav/__p-iA312kg_70.000_80.000.wav

You should see the following output:

2024-03-20 18:40:35,611 INFO [pretrained.py:175] Reading sound files: ['downloads/audioset/eval/wav/__p-iA312kg_70.000_80.000.wav']
2024-03-20 18:40:35,616 INFO [pretrained.py:181] Decoding started
Top 5 predicted labels of the 0 th audio are ['Cat', 'Animal', 'Domestic animals, pets', 'Meow', 'Caterwaul'] with probability of [0.9497291445732117, 0.8830551505088806, 0.8787704110145569, 0.2819702923297882, 0.18947316706180573]
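The probabilities in the output above sum to more than one, which is consistent with independent per-class scores rather than a softmax over classes. The sketch below illustrates that kind of post-processing under stated assumptions: it is not the code in pretrained.py, it assumes the label file has `index` and `display_name` columns, and the logits, label rows, and `mid` values are made up for illustration.

```python
import csv
import io
import math
from typing import Dict, List, Sequence, Tuple


def load_label_dict(csv_text: str) -> Dict[int, str]:
    """Map class index to display name (assumed columns: index, mid, display_name)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["index"]): row["display_name"] for row in reader}


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def top_k_labels(
    logits: Sequence[float], labels: Dict[int, str], k: int = 5
) -> List[Tuple[str, float]]:
    """Score every class independently and return the k most probable labels."""
    probs = [sigmoid(x) for x in logits]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(labels[i], probs[i]) for i in ranked]


# Hypothetical three-class label file; the csv module handles the quoted comma.
toy_csv = (
    "index,mid,display_name\n"
    '0,/m/toy0,"Cat"\n'
    '1,/m/toy1,"Domestic animals, pets"\n'
    '2,/m/toy2,"Speech"\n'
)
toy_labels = load_label_dict(toy_csv)
toy_top2 = top_k_labels([3.0, 2.0, -4.0], toy_labels, k=2)
```

Because each class is scored on its own, several related labels (here "Cat" and "Domestic animals, pets") can all receive high probabilities for the same clip.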

marcoyang1998 · 2024-03-21T02:45:06Z

The pretrained model can be found here: https://huggingface.co/marcoyang/icefall-audio-tagging-audioset-zipformer-2024-03-12#/

marcoyang1998 · 2024-03-26T07:04:54Z

Since there is currently no official way to download AudioSet, this recipe assumes that you have your own copy of it. The required dataset structure will be described in prepare.sh (to be added in the next commit).

marcoyang1998 · 2024-03-29T09:32:21Z

Added jit_pretrained.py. The script produces the same results as pretrained.py.
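One of the commits in this PR notes that batched ONNX inference works but that the results might be affected by the padding. The sketch below shows one plausible mechanism, assuming a classifier that averages frame-level scores over time. That pooling scheme is an assumption made for illustration and is not confirmed by this PR; all names and numbers are hypothetical.

```python
from typing import List


def pad_to(frames: List[float], target_len: int, pad_value: float = 0.0) -> List[float]:
    """Right-pad a sequence of frame-level scores to a common batch length."""
    return frames + [pad_value] * (target_len - len(frames))


def mean_pool(frames: List[float]) -> float:
    """Average over every frame, padding included."""
    return sum(frames) / len(frames)


def masked_mean_pool(frames: List[float], length: int) -> float:
    """Average over the first `length` frames only, ignoring the padding."""
    valid = frames[:length]
    return sum(valid) / len(valid)


# A 3-frame clip batched together with a 5-frame clip must be padded to 5 frames.
short_clip = [0.8, 0.6, 0.7]
padded_clip = pad_to(short_clip, 5)

naive_score = mean_pool(padded_clip)             # diluted by the two padded frames
masked_score = masked_mean_pool(padded_clip, 3)  # same as scoring the clip alone
```

Here the naive average drops from about 0.7 to about 0.42 because the padded frames are counted, whereas the length-aware average matches the single-clip result. Running clips one at a time (batch size 1) avoids the issue altogether.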

marcoyang1998 · 2024-04-09T03:29:29Z

@csukuangfj Can we merge this now?

csukuangfj

Thanks! Just left a minor comment.

marcoyang1998 added 11 commits December 19, 2023 15:14

initial commit

baa03c7

add datamodule for audioset

a1aca34

minor fix

bf58b63

add softlink

57ff00d

add evaluation script

bd01c21

update the manifest

3e22108

Merge branch 'master' of github.com:marcoyang1998/icefall into audio_tagging

1279355

add export.py

4e14800

support exporting the pretrained model

219d55d

add file

1921692

add inference script with a pretrained model

9c4db1b

csukuangfj reviewed Mar 26, 2024

egs/audioset/AT/local/generate_audioset_manifest.py (outdated)

marcoyang1998 added 3 commits March 26, 2024 10:24

fix style

4bce81b

Merge remote-tracking branch 'origin' into audio_tagging

18479fc

fix style

7a8c9b7

csukuangfj reviewed Mar 26, 2024

egs/audioset/AT/local/generate_audioset_manifest.py

csukuangfj requested changes Mar 26, 2024

egs/audioset/AT/local/generate_audioset_manifest.py (3 outdated review threads)

csukuangfj requested changes Mar 26, 2024

enhance documentation

f4c1872

minor changes

64dbcd0

csukuangfj reviewed Mar 26, 2024

egs/audioset/AT/zipformer/train.py (outdated)

csukuangfj requested changes Mar 26, 2024

egs/audioset/AT/zipformer/train.py (2 outdated review threads)

fix doc

8b234b3

csukuangfj requested changes Mar 26, 2024

marcoyang1998 added 2 commits March 29, 2024 17:07

fix the comments; wrap the classifier for jit script

a8ca029

add a file to test jit script model

2d1072f

marcoyang1998 added 4 commits March 29, 2024 17:08

minor updates

6a7ac68

update comments in evaluate.py

5a4b712

minor updates

9e9bc75

add readme and results

39e7de4

marcoyang1998 added 9 commits March 29, 2024 18:14

support export onnx model

ff2975d

add onnx pretrained

7bd679f

minor updates

686d2d9

fix style

f3e8e42

support onnx export with batch size 1; also works for batch processing, but the results might be affected by the padding

01b744f

update the script to generate audioset manfiest

25d22d9

add prepare.sh

ff484be

add missing files

1ca4646

update comments

864914f

csukuangfj reviewed Apr 9, 2024

egs/audioset/AT/RESULTS.md

csukuangfj approved these changes Apr 9, 2024

add link to audioset

b134889

marcoyang1998 merged commit 1732daf into k2-fsa:master Apr 9, 2024
7 checks passed

This was referenced Apr 9, 2024

Add CI test for the AudioSet recipe. #1585

Merged

Support audio tagging using zipformer k2-fsa/sherpa-onnx#747

Merged
