Add doc about FST-based CTC forced alignment. #1482

csukuangfj · 2024-01-30T11:35:20Z

This doc is based on the CTC forced alignment API tutorial from torchaudio, but it uses
an FST-based approach.

Using https://github.com/k2-fsa/kaldi-decoder, I can produce output identical to torchaudio's.

I am refactoring the code and will prepare at least two colab notebooks.
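
For readers new to the idea, here is a minimal pure-Python sketch of what CTC forced alignment computes. It is not the code from this PR or from kaldi-decoder. It runs Viterbi over the blank-interleaved token sequence, which should correspond to the best path through a linear CTC graph built for the transcript. The function name `ctc_forced_align`, the convention `blank=0`, and the input layout (a list of per-frame log-probability lists) are assumptions made for illustration.

```python
import math


def ctc_forced_align(log_probs, tokens, blank=0):
    """Return one label per frame for the best CTC path that spells `tokens`.

    log_probs: list of T lists, log_probs[t][v] = log P(label v at frame t).
    tokens: transcript as a list of token ids (no blanks).
    NOTE: hypothetical helper for illustration, not an icefall or torchaudio API.
    """
    num_frames = len(log_probs)
    if num_frames == 0:
        raise ValueError("no frames to align")

    # Expanded state sequence: blank, t1, blank, t2, ..., blank
    states = [blank]
    for tok in tokens:
        states += [tok, blank]
    num_states = len(states)

    neg_inf = -math.inf
    score = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]

    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][states[0]]
    if num_states > 1:
        score[0][1] = log_probs[0][states[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            # Option 1: stay in the same state.
            best_prev, best = s, score[t - 1][s]
            # Option 2: advance from the previous state.
            if s >= 1 and score[t - 1][s - 1] > best:
                best_prev, best = s - 1, score[t - 1][s - 1]
            # Option 3: skip the blank between two different tokens.
            if (
                s >= 2
                and states[s] != blank
                and states[s] != states[s - 2]
                and score[t - 1][s - 2] > best
            ):
                best_prev, best = s - 2, score[t - 1][s - 2]
            if best == neg_inf:
                continue  # state s is not reachable at frame t
            score[t][s] = best + log_probs[t][states[s]]
            back[t][s] = best_prev

    # A path may end in the trailing blank or in the last token.
    s = num_states - 1
    if num_states > 1 and score[-1][num_states - 2] > score[-1][s]:
        s = num_states - 2
    if score[-1][s] == neg_inf:
        raise ValueError("not enough frames to align the transcript")

    # Trace back from the last frame to the first.
    path = []
    for t in range(num_frames - 1, -1, -1):
        path.append(states[s])
        s = back[t][s]
    return path[::-1]
```

An FST-based implementation would get the same kind of per-frame label sequence by intersecting the transcript graph with the model's frame-level scores and taking the best path. The dynamic program above only makes that search explicit for a single utterance.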

whaozl · 2024-02-07T07:53:50Z

Hi, can the alignment tool produce accurate word timestamps at the start and end positions?

csukuangfj · 2024-02-07T11:34:57Z

Hi, can the alignment tool produce accurate word timestamps at the start and end positions?

It depends on what model you use.

You can have a look at https://pytorch.org/audio/main/tutorials/ctc_forced_alignment_api_tutorial.html
With the same model, we can produce results identical to torchaudio's.
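
To make the timestamp question concrete: once a per-frame label sequence is available, start and end times come from merging runs of identical non-blank labels into spans and multiplying frame indices by the model's frame shift. The sketch below is only illustrative. `TokenSpan`, `merge_frames`, `span_to_seconds`, and the 0.02 s frame shift are assumptions, and the real frame shift depends on the model's subsampling.

```python
from dataclasses import dataclass


@dataclass
class TokenSpan:
    token: int
    start: int  # first frame of the span (inclusive)
    end: int    # one past the last frame of the span (exclusive)


def merge_frames(path, blank=0):
    """Collapse a per-frame label sequence into non-blank token spans.

    NOTE: hypothetical helper for illustration only.
    """
    spans = []
    start = 0
    for i in range(1, len(path) + 1):
        # Close the current run at the end of the path or when the label changes.
        if i == len(path) or path[i] != path[start]:
            if path[start] != blank:
                spans.append(TokenSpan(path[start], start, i))
            start = i
    return spans


def span_to_seconds(span, frame_shift=0.02):
    """Convert a span to (start, end) in seconds.

    frame_shift=0.02 is an assumed value, not a property of any given model.
    """
    return span.start * frame_shift, span.end * frame_shift
```

For example, `merge_frames([1, 1, 0, 2, 2])` yields one span for token 1 covering frames 0 to 2 and one for token 2 covering frames 3 to 5. How closely those boundaries match the audio still depends on the model, as noted above.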

lifeiteng · 2024-06-10T16:05:31Z

@csukuangfj Will it be completed soon?

csukuangfj · 2024-06-12T07:34:34Z

@csukuangfj Will it be completed soon?

Yes. I am working on it now.

lifeiteng · 2024-06-12T11:52:54Z

@csukuangfj Has the k2-based approach been forgotten?

csukuangfj · 2024-06-12T11:54:52Z

No, it is still a TODO.
Please use the first approach for now, or add the second approach with k2 yourself.
All the APIs you need are there. You only need to combine them.

cageyoko · 2024-09-29T08:47:10Z

@csukuangfj Sorry to bother you; I look forward to your reply. I recently compared several alignment methods, such as TorchAudio (CTC), whisperX, and funasr, and found that none of them was as good as Kaldi-based alignment. This conclusion is consistent with the paper "https://www.isca-archive.org/interspeech_2024/rousso24_interspeech.pdf". Do you have any advice, or any planned work on alignment in the k2-fsa project?

danpovey · 2024-09-29T10:48:32Z

Kaldi's TDNN systems have limited context, which may give better alignment. (Or GMMs may give even more precise alignments, as they have even less context.) I'm not sure that alignment is a super big priority in k2-fsa right now. Is there a specific type of application you have in mind?

cageyoko · 2024-09-29T11:13:54Z

Thanks, Daniel, for resolving a question I have had for a long time (is less context more important for alignment?). Yes. I have been working on spoken pronunciation scoring, which relies heavily on the accuracy of phoneme or word alignment.

WIP: Add doc about FST-based CTC forced alignment.

0a24446

csukuangfj added 2 commits June 12, 2024 15:38

minor fixes

68e202c

Finish kaldi-based approach

f515aed

csukuangfj changed the title ~~WIP: Add doc about FST-based CTC forced alignment.~~ Add doc about FST-based CTC forced alignment. Jun 12, 2024

Merge remote-tracking branch 'dan/master' into doc-force-alignment-kaldi

cb21b87

csukuangfj merged commit ec0389a into k2-fsa:master Jun 12, 2024
8 checks passed

csukuangfj deleted the doc-force-alignment-kaldi branch June 12, 2024 09:37

csukuangfj mentioned this pull request Jun 12, 2024

Add merge_tokens for ctc forced alignment #1649

Merged

yfyeung pushed a commit to yfyeung/icefall that referenced this pull request Aug 9, 2024

Add doc about FST-based CTC forced alignment. (k2-fsa#1482)

e765d1c
