Commit
1 parent 2c9c11d, commit e392571
Showing 5 changed files with 59 additions and 20 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,3 @@ | ||
.. _introduction: | ||
|
||
.. _`Kaldi homepage`: http://kaldi-asr.org/ | ||
|
||
|
@@ -16,16 +15,51 @@ | |
|
||
.. _`EasyAlign homepage`: http://latlcui.unige.ch/phonetique/easyalign.php | ||
|
||
.. _`@wavable`: https://twitter.com/wavable | ||
|
||
.. _`Github`: http://mmcauliffe.github.io/ | ||
|
||
.. _introduction: | ||
|
||
************ | ||
Introduction | ||
============ | ||
************ | ||
|
||
What is forced alignment? | ||
------------------------- | ||
========================= | ||
|
||
Forced alignment is a technique to take an orthographic transcription of | ||
an audio file and generate a time-aligned version using a pronunciation | ||
dictionary to look up phones for words. | ||
|
||
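The dictionary lookup mentioned above can be illustrated with a minimal sketch. It covers only the word-to-phone lookup, not the time alignment itself, which requires acoustic models. The dictionary entries and the function name ``transcript_to_phones`` are hypothetical and are not part of the Montreal Forced Aligner.

```python
# Toy pronunciation dictionary (hypothetical entries, ARPAbet-style phones).
TOY_DICTIONARY = {
    "the": ["DH", "AH"],
    "cat": ["K", "AE", "T"],
    "sat": ["S", "AE", "T"],
}


def transcript_to_phones(transcript, dictionary):
    """Look up each word of an orthographic transcript.

    Returns a pair (phones, oov): the flat phone sequence for the words
    that were found, and the list of words missing from the dictionary.
    """
    phones = []
    oov = []
    for word in transcript.lower().split():
        pronunciation = dictionary.get(word)
        if pronunciation is None:
            # Out-of-vocabulary word: no phones can be looked up for it.
            oov.append(word)
        else:
            phones.extend(pronunciation)
    return phones, oov


phones, oov = transcript_to_phones("The cat sat", TOY_DICTIONARY)
print(phones)
print(oov)
```

An aligner would then assign a start and end time to each phone in the sequence; words missing from the dictionary need separate handling.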
|
||
Montreal Forced Aligner | ||
======================= | ||
|
||
Pipeline of training | ||
-------------------- | ||
|
||
The Montreal Forced Aligner goes through three stages of training. The | ||
first pass of alignment uses monophone models, where each phone is modelled | ||
the same regardless of phonological context. The second pass uses triphone | ||
models, where context on either side of a phone is taken into account for | ||
acoustic models. The final pass enhances the triphone model by taking | ||
into account speaker differences, and calculates a transformation of the | ||
mel-frequency cepstral coefficient (MFCC) features for each speaker. | ||
|
||
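The difference between monophone and triphone units can be sketched as follows. The ``left-phone+right`` label notation is a convention common in HTK-style tools and is used here only for illustration; the function name ``triphone_labels`` and the ``sil`` boundary symbol are assumptions, not the aligner's internal representation.

```python
def triphone_labels(phones, boundary="sil"):
    """Turn a monophone sequence into context-dependent triphone labels.

    Each phone is labelled with its left and right neighbours, so the same
    phone gets a different unit in different contexts. The utterance edges
    are padded with a boundary symbol (assumed here to be silence).
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


monophones = ["K", "AE", "T"]
print(monophones)                   # one model per phone, context ignored
print(triphone_labels(monophones))  # one model per phone in context
```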
Use of speaker information | ||
-------------------------- | ||
|
||
A key feature of the Montreal Forced Aligner is the use of speaker | ||
adaptation in alignment. The command line interface provides multiple | ||
ways of grouping audio files by speaker, depending on the input file format | ||
(either :ref:`prosodylab_format` or :ref:`textgrid_format`). | ||
In addition to speaker adaptation in the final pass of alignment, speaker | ||
information is used for grouping audio files together for multiprocessing | ||
and cepstral mean and variance normalization (CMVN). If speakers are not | ||
properly specified, then feature calculation might not succeed due to | ||
limits on the number of open files. | ||
|
||
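To make the role of speaker grouping in CMVN concrete, here is a minimal pure-Python sketch. It assumes features are already available as lists of per-frame vectors keyed by file name; the names ``cmvn_per_speaker``, ``features_by_file`` and ``speaker_of_file`` are hypothetical, and the aligner itself relies on Kaldi for this step rather than on code like this.

```python
from statistics import fmean, pstdev


def cmvn_per_speaker(features_by_file, speaker_of_file):
    """Normalize each feature dimension to zero mean and unit variance,
    with statistics pooled over all files of the same speaker."""
    # Pool the frames of every file that belongs to the same speaker.
    frames_by_speaker = {}
    for name, frames in features_by_file.items():
        frames_by_speaker.setdefault(speaker_of_file[name], []).extend(frames)

    # Per-speaker mean and standard deviation for each feature dimension.
    stats = {}
    for speaker, frames in frames_by_speaker.items():
        columns = list(zip(*frames))
        means = [fmean(column) for column in columns]
        # Guard against division by zero for constant dimensions.
        stds = [pstdev(column) or 1.0 for column in columns]
        stats[speaker] = (means, stds)

    # Apply the statistics of a file's speaker to each of its frames.
    normalized = {}
    for name, frames in features_by_file.items():
        means, stds = stats[speaker_of_file[name]]
        normalized[name] = [
            [(x - m) / s for x, m, s in zip(frame, means, stds)]
            for frame in frames
        ]
    return normalized


features = {
    "a.wav": [[1.0, 10.0], [3.0, 10.0]],
    "b.wav": [[5.0, 10.0]],
}
speakers = {"a.wav": "speaker1", "b.wav": "speaker1"}
print(cmvn_per_speaker(features, speakers))
```

Because the statistics are pooled per speaker, assigning files to the wrong speaker changes the normalization applied to every frame of those files.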
Underlying technology | ||
--------------------- | ||
|
||
|
@@ -34,8 +68,8 @@ The Montreal Forced Aligner uses the Kaldi ASR toolkit | |
Kaldi is under active development and includes state-of-the-art algorithms for tasks | ||
in automatic speech recognition beyond forced alignment. | ||
|
||
Relation to other forced alignment tools | ||
---------------------------------------- | ||
Other forced alignment tools | ||
============================ | ||
|
||
Most tools for forced alignment used by linguists rely on the HMM Toolkit | ||
(HTK; `HTK homepage`_), including: | ||
|
@@ -45,29 +79,29 @@ Most tools for forced alignment used by linguists rely on the HMM Toolkit | |
* FAVE-align (`FAVE-align homepage`_) | ||
* (Web) MAUS (`MAUS homepage`_) | ||
|
||
Praat (`Praat homepage`_) | ||
has a built-in aligner as well. | ||
EasyAlign (`EasyAlign homepage`_) | ||
is a Praat plug-in built to facilitate its use. | ||
|
||
|
||
Praat (`Praat homepage`_) has a built-in aligner as well. | ||
EasyAlign (`EasyAlign homepage`_) is a Praat plug-in built to facilitate its use. | ||
|
||
Montreal Forced Aligner is most similar to the Prosodylab-aligner, and | ||
was developed at the same lab. Because the Montreal Forced Aligner uses | ||
a different toolkit to do alignment, trained models cannot be used with | ||
the Prosodylab-aligner, and vice versa. | ||
|
||
Contributors | ||
------------ | ||
============ | ||
|
||
* Michael McAuliffe | ||
* Michael McAuliffe ([email protected], `Github`_, `@wavable`_) | ||
* Michaela Socolof | ||
* Sarah Mihuc | ||
* Michael Wagner | ||
|
||
Citation | ||
-------- | ||
======== | ||
|
||
McAuliffe, Michael, Michaela Socolof, Sarah Mihuc, and Michael Wagner (2016). | ||
Montreal Forced Aligner [Computer program]. Version 0.5, | ||
retrieved 13 July 2016 from http://montrealcorpustools.github.io/Montreal-Forced-Aligner/. | ||
|
||
Funding | ||
------- | ||
======= | ||
|