
Installation

Please refer to https://icefall.readthedocs.io/en/latest/installation/index.html for installation.

Recipes

Please refer to https://icefall.readthedocs.io/en/latest/recipes/index.html for more information.

We provide 6 recipes at present:

yesno

This is the simplest ASR recipe in icefall and can be run on CPU. Training takes less than 30 seconds and gives you the following WER:

[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
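The numbers in the log line above are related by WER = (insertions + deletions + substitutions) / reference words. The following is a minimal sketch that parses such a line and recomputes the percentage. The function names `parse_wer_line` and `wer_percent` are illustrative and are not part of icefall.

```python
import re


def wer_percent(ins, dels, subs, num_ref_words):
    """Word error rate in percent: all errors divided by reference words."""
    return 100.0 * (ins + dels + subs) / num_ref_words


def parse_wer_line(line):
    """Extract the counts from a '%WER x% [E / N, I ins, D del, S sub ]' line.

    Hypothetical helper for illustration only; the regular expression is
    written against the single log line shown in this README.
    """
    pattern = (
        r"%WER\s+([\d.]+)%\s+\[(\d+)\s*/\s*(\d+),"
        r"\s*(\d+)\s+ins,\s*(\d+)\s+del,\s*(\d+)\s+sub\s*\]"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError(f"not a WER line: {line!r}")
    return {
        "reported": float(match.group(1)),
        "errors": int(match.group(2)),
        "ref_words": int(match.group(3)),
        "ins": int(match.group(4)),
        "del": int(match.group(5)),
        "sub": int(match.group(6)),
    }


stats = parse_wer_line("[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]")
recomputed = wer_percent(stats["ins"], stats["del"], stats["sub"], stats["ref_words"])
print(f"reported {stats['reported']:.2f}%, recomputed {recomputed:.2f}%")
```

Here one deletion out of 240 reference words rounds to the reported 0.42%.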

We provide a Colab notebook for this recipe.

LibriSpeech

Please see https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/RESULTS.md for the latest results.

We provide the following models for this recipe:

Conformer CTC Model

The best WER we currently have is:

	test-clean	test-other
WER	2.42	5.73

We provide a Colab notebook to run a pre-trained conformer CTC model.

TDNN LSTM CTC Model

The WER for this model is:

	test-clean	test-other
WER	6.59	17.69

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model.

Transducer: Conformer encoder + LSTM decoder

This model uses a Conformer as the encoder and an LSTM as the decoder.

The best WER with greedy search is:

	test-clean	test-other
WER	3.07	7.51

We provide a Colab notebook to run a pre-trained RNN-T conformer model.

Transducer: Conformer encoder + Embedding decoder

This model uses a Conformer as the encoder. The decoder consists of one embedding layer and one convolutional layer.
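To illustrate what a decoder made of an embedding layer plus a convolutional layer looks like, here is a toy, dependency-free sketch. It is not the icefall implementation: the class name `StatelessDecoder`, the context size of 2, the per-channel (depthwise) convolution, the blank padding, and the random weights are all assumptions chosen for illustration. The point it shows is that the output at each position depends only on the last few tokens, with no recurrent state.

```python
import random


class StatelessDecoder:
    """Toy prediction network: an embedding lookup followed by a 1-D convolution.

    Illustrative only. Weights are random and the layer shapes are assumptions.
    """

    def __init__(self, vocab_size, embed_dim, context_size=2, blank_id=0, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.context_size = context_size
        self.blank_id = blank_id
        # Embedding table of shape (vocab_size, embed_dim).
        self.embedding = [
            [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
            for _ in range(vocab_size)
        ]
        # Convolution weights: one weight per (context position, channel).
        self.conv = [
            [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
            for _ in range(context_size)
        ]

    def forward(self, tokens):
        """Return one output vector per input token."""
        # Left-pad with blanks so every position sees a full context window.
        padded = [self.blank_id] * (self.context_size - 1) + list(tokens)
        outputs = []
        for t in range(len(tokens)):
            window = padded[t : t + self.context_size]
            out = [0.0] * self.embed_dim
            for k, token in enumerate(window):
                emb = self.embedding[token]
                for d in range(self.embed_dim):
                    out[d] += self.conv[k][d] * emb[d]
            outputs.append(out)
        return outputs


decoder = StatelessDecoder(vocab_size=10, embed_dim=4)
a = decoder.forward([5, 3, 7])
b = decoder.forward([1, 3, 7])
# The last output only sees tokens (3, 7), so the differing first token is irrelevant.
print(a[-1] == b[-1])
```

Because the output is a function of a fixed-size token window, hypotheses that share the same recent tokens share the same decoder output, which keeps search simple.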

The best WER using modified beam search with beam size 4 is:

	test-clean	test-other
WER	2.56	6.27

Note: No auxiliary losses are used in the training and no LMs are used in the decoding.

We provide a Colab notebook to run a pre-trained transducer conformer + stateless decoder model.

k2 pruned RNN-T

	test-clean	test-other
WER	2.57	5.95

k2 pruned RNN-T + GigaSpeech

	test-clean	test-other
WER	2.00	4.63
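The two k2 pruned RNN-T tables above can be compared as a relative WER reduction, (baseline - improved) / baseline. The sketch below does that arithmetic, assuming the two tables were measured on the same test sets under comparable decoding settings (the README does not state the decoding method for either). The function name `relative_reduction` is illustrative.

```python
def relative_reduction(baseline, improved):
    """Relative error reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


# (k2 pruned RNN-T, k2 pruned RNN-T + GigaSpeech), copied from the tables above.
results = {
    "test-clean": (2.57, 2.00),
    "test-other": (5.95, 4.63),
}

for test_set, (baseline, improved) in results.items():
    reduction = relative_reduction(baseline, improved)
    print(f"{test_set}: {reduction:.1f}% relative WER reduction")
```

Under that assumption, adding GigaSpeech data corresponds to roughly a 22% relative reduction on both test sets.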

Aishell

We provide three models for this recipe: conformer CTC model, Transducer Stateless model, and TDNN LSTM CTC model.

Conformer CTC Model

The best CER we currently have is:

	test
CER	4.26
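CER (character error rate) is the character-level counterpart of WER: the edit distance between the hypothesis and the reference, divided by the number of reference characters. The following is a minimal sketch of that computation; the function names and the example sentence pair are made up for illustration and are not taken from the Aishell test set or from icefall's scoring code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,  # deletion of a reference character
                    curr[j - 1] + 1,  # insertion of a hypothesis character
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def cer_percent(ref, hyp):
    """Character error rate in percent for one reference/hypothesis pair."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical pair: one substituted character out of six.
print(f"{cer_percent('今天天气很好', '今天天汽很好'):.2f}")
```

For a whole test set, the error counts and reference lengths are summed over all utterances before dividing, rather than averaging per-utterance rates.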

We provide a Colab notebook to run a pre-trained conformer CTC model.

Transducer Stateless Model

The best CER we currently have is:

	test
CER	4.68

We provide a Colab notebook to run a pre-trained Transducer Stateless model.

TDNN LSTM CTC Model

The CER for this model is:

	test
CER	10.16

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model.

TIMIT

We provide two models for this recipe: TDNN LSTM CTC model and TDNN LiGRU CTC model.

TDNN LSTM CTC Model

The best PER we currently have is:

	TEST
PER	19.71%

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model.

TDNN LiGRU CTC Model

The PER for this model is:

	TEST
PER	17.66%

We provide a Colab notebook to run a pre-trained TDNN LiGRU CTC model.

TED-LIUM3

We provide two models for this recipe: Transducer Stateless: Conformer encoder + Embedding decoder and Pruned Transducer Stateless: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss.

Transducer Stateless: Conformer encoder + Embedding decoder

The best WER using modified beam search with beam size 4 is:

	dev	test
WER	6.91	6.33

Note: No auxiliary losses are used in the training and no LMs are used in the decoding.

We provide a Colab notebook to run a pre-trained Transducer Stateless model.

Pruned Transducer Stateless: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss

The best WER using modified beam search with beam size 4 is:

	dev	test
WER	6.77	6.14

We provide a Colab notebook to run a pre-trained Pruned Transducer Stateless model.

GigaSpeech

We provide two models for this recipe: Conformer CTC model and Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss.

Conformer CTC

	Dev	Test
WER	10.47	10.58

Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss

	Dev	Test
greedy search	10.51	10.73
fast beam search	10.50	10.69
modified beam search	10.40	10.51
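As a small worked example of reading the table above, the sketch below stores the three decoding results in a dictionary and picks the method with the lowest WER on each split. The numbers are copied from the table; the helper name `best_method` is illustrative.

```python
# WER (%) per decoding method, copied from the table above.
results = {
    "greedy search": {"Dev": 10.51, "Test": 10.73},
    "fast beam search": {"Dev": 10.50, "Test": 10.69},
    "modified beam search": {"Dev": 10.40, "Test": 10.51},
}


def best_method(results, split):
    """Return the decoding method with the lowest WER on the given split."""
    return min(results, key=lambda method: results[method][split])


for split in ("Dev", "Test"):
    method = best_method(results, split)
    print(f"{split}: {method} ({results[method][split]:.2f})")
```

Modified beam search gives the lowest WER on both splits, and the spread between the three methods is at most 0.22 absolute.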

Deployment with C++

Once you have trained a model in icefall, you may want to deploy it with C++, without Python dependencies.

Please refer to the documentation https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html#deployment-with-c for how to do this.

We also provide a Colab notebook showing how to run a torch scripted model in k2 with C++.

goodatlas/icefall

Folders and files

Latest commit

History

Repository files navigation

Installation

Recipes

yesno

LibriSpeech

Conformer CTC Model

TDNN LSTM CTC Model

Transducer: Conformer encoder + LSTM decoder

Transducer: Conformer encoder + Embedding decoder

k2 pruned RNN-T

k2 pruned RNN-T + GigaSpeech

Aishell

Conformer CTC Model

Transducer Stateless Model

TDNN LSTM CTC Model

TIMIT

TDNN LSTM CTC Model

TDNN LiGRU CTC Model

TED-LIUM3

Transducer Stateless: Conformer encoder + Embedding decoder

Pruned Transducer Stateless: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss

GigaSpeech

Conformer CTC

Pruned stateless RNN-T: Conformer encoder + Embedding decoder + k2 pruned RNN-T loss

Deployment with C++

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages