Some doc updates
mmcauliffe committed Jul 11, 2016
1 parent 2c9c11d commit e392571
Showing 5 changed files with 59 additions and 20 deletions.
2 changes: 1 addition & 1 deletion docs/source/aligning.rst
Original file line number Diff line number Diff line change
@@ -38,7 +38,7 @@ Align using pretrained models
The Montreal Forced Aligner comes with pretrained models/dictionaries for:

- English - trained from the LibriSpeech data set (`LibriSpeech corpus`_)
- Quebec French
- Quebec French - coming soon

Command template:

8 changes: 6 additions & 2 deletions docs/source/data_format.rst
@@ -4,7 +4,9 @@
Data formats
************

Prosodylab-Aligner format
.. _prosodylab_format:

Prosodylab-aligner format
=========================

Every .wav sound file you are aligning must have a corresponding .lab
@@ -35,6 +37,8 @@ for words and a tier for phones.

<<PICTURE OF OUTPUT TEXTGRID - ALA A LIBRISPEECH UTTERANCE>>

.. _textgrid_format:

TextGrid format
===============

@@ -76,7 +80,7 @@ be replaced in the output with '<unk>' for unknown word.
the unknown words per utterance.
As part of parsing orthographic transcriptions, punctuation is stripped
from the ends of words. In addition, all words are converted to lowercase
from the ends of words. In addition, all words are converted to lowercase
so that dictionary lookup is not case-sensitive.
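
The normalization described above can be sketched roughly as follows. This is only an illustration, not the aligner's actual code: the function names are hypothetical, and using ``string.punctuation`` (ASCII only) as the set of stripped characters is an assumption.

```python
import string


def normalize_word(word):
    """Strip punctuation from both ends of a word, then lowercase it.

    Assumption: the stripped characters are those in string.punctuation.
    Punctuation inside a word (e.g. the apostrophe in "don't") is kept.
    """
    return word.strip(string.punctuation).lower()


def normalize_transcript(text):
    """Split a transcript on whitespace and normalize each token.

    Tokens that consist only of punctuation become empty and are dropped.
    """
    words = (normalize_word(token) for token in text.split())
    return [w for w in words if w]


if __name__ == "__main__":
    print(normalize_transcript('"Hello," she said. Don\'t stop!'))
```

Because only the ends of each token are stripped, word-internal apostrophes and hyphens survive and can still match dictionary entries.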

Dictionary lookup will attempt to generate maximal coverage of
2 changes: 1 addition & 1 deletion docs/source/dictionary.rst
@@ -1,4 +1,3 @@
.. _dictionary:

.. _`LibriSpeech lexicon`: http://www.openslr.org/resources/11/librispeech-lexicon.txt

@@ -8,6 +7,7 @@

.. _`Prosodylab-aligner French dictionary`: https://github.com/prosodylab/prosodylab-alignermodels/blob/master/FrenchQuEu/fr-QuEu.dict

.. _dictionary:

************
Dictionaries
3 changes: 2 additions & 1 deletion docs/source/installation.rst
@@ -1,9 +1,10 @@
.. _installation:

.. _`Montreal Forced Aligner releases`: https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases

.. _`Kaldi GitHub repository`: https://github.com/kaldi-asr/kaldi

.. _installation:

************
Installation
************
64 changes: 49 additions & 15 deletions docs/source/introduction.rst
@@ -1,4 +1,3 @@
.. _introduction:

.. _`Kaldi homepage`: http://kaldi-asr.org/

@@ -16,16 +15,51 @@

.. _`EasyAlign homepage`: http://latlcui.unige.ch/phonetique/easyalign.php

.. _`@wavable`: https://twitter.com/wavable

.. _`Github`: http://mmcauliffe.github.io/

.. _introduction:

************
Introduction
============
************

What is forced alignment?
-------------------------
=========================

Forced alignment is a technique to take an orthographic transcription of
an audio file and generate a time-aligned version using a pronunciation
dictionary to look up phones for words.
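
As a rough illustration of the dictionary lookup step only, the sketch below pairs each word of a transcription with its phones. The dictionary entries and the function name are hypothetical. The ``'<unk>'`` label follows the unknown-word convention described in the data format documentation. Assigning start and end times to each phone is the job of the acoustic model and is not shown.

```python
# Toy pronunciation dictionary; entries are illustrative only and are
# not taken from a real lexicon.
TOY_DICTIONARY = {
    "the": ["DH", "AH0"],
    "cat": ["K", "AE1", "T"],
}


def lookup_phones(words, dictionary, unknown="<unk>"):
    """Pair each word with its phone sequence.

    Words missing from the dictionary are relabelled with the unknown
    symbol and given no phones (None). How an aligner models such words
    acoustically is outside the scope of this sketch.
    """
    result = []
    for word in words:
        phones = dictionary.get(word)
        if phones is None:
            result.append((unknown, None))
        else:
            result.append((word, list(phones)))
    return result


if __name__ == "__main__":
    for word, phones in lookup_phones(["the", "cat", "sat"], TOY_DICTIONARY):
        print(word, phones)
```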


Montreal Forced Aligner
=======================

Pipeline of training
--------------------

The Montreal Forced Aligner goes through three stages of training. The
first pass of alignment uses monophone models, where each phone is modelled
the same regardless of phonological context. The second pass uses triphone
models, where context on either side of a phone is taken into account for
acoustic models. The final pass enhances the triphone model by taking
into account speaker differences, and calculates a transformation of the
mel-frequency cepstral coefficient (MFCC) features for each speaker.
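
To make the difference between the first two passes concrete, the sketch below labels a phone sequence first as monophones and then as triphones. The ``left-phone+right`` notation is a common convention borrowed from HTK, and the ``sil`` boundary symbol and function name are assumptions made for illustration. This is not how Kaldi represents context internally.

```python
def triphone_labels(phones, boundary="sil"):
    """Label each phone with its left and right neighbours.

    Uses the conventional "left-phone+right" notation. The utterance is
    padded with a boundary symbol (assumed here to be "sil") so that the
    first and last phones also have a context on both sides.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    phones = ["K", "AE", "T"]
    print(phones)                   # monophone view: context ignored
    print(triphone_labels(phones))  # triphone view: context included
```

In the monophone view every ``AE`` shares one model. In the triphone view ``K-AE+T`` and, say, ``B-AE+D`` are distinct units, which is why the second pass can capture coarticulation that the first pass cannot.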

Use of speaker information
--------------------------

A key feature of the Montreal Forced Aligner is the use of speaker
adaptation in alignment. The command line interface provides multiple
ways of grouping audio files by speaker, depending on the input file format
(either :ref:`prosodylab_format` or :ref:`textgrid_format`).
In addition to speaker adaptation in the final pass of alignment, speaker
information is used for grouping audio files together for multiprocessing
and cepstral mean and variance normalization (CMVN). If speakers are not
properly specified, then feature calculation might not succeed due to
limits on the number of open files.
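
The idea behind per-speaker CMVN can be sketched as follows: frames from all of a speaker's files are pooled, and each feature dimension is shifted to zero mean and scaled to unit variance using that speaker's statistics. This is a minimal pure-Python illustration under assumed data structures (a list of ``(speaker, frames)`` pairs). The function name is hypothetical and Kaldi's own implementation is not shown.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def cmvn_by_speaker(utterances):
    """Apply mean and variance normalization using per-speaker statistics.

    utterances: list of (speaker, frames) pairs, where frames is a list
    of equal-length feature vectors (e.g. MFCCs). Returns the normalized
    frames for each utterance, in the same order as the input.
    """
    # Pool the frames of every utterance belonging to the same speaker.
    pooled = defaultdict(list)
    for speaker, frames in utterances:
        pooled[speaker].extend(frames)

    # Compute the mean and standard deviation of each feature dimension.
    stats = {}
    for speaker, frames in pooled.items():
        columns = list(zip(*frames))
        means = [fmean(col) for col in columns]
        # A constant dimension has zero deviation; divide by 1.0 instead.
        stds = [pstdev(col) or 1.0 for col in columns]
        stats[speaker] = (means, stds)

    # Normalize every frame with its speaker's statistics.
    normalized = []
    for speaker, frames in utterances:
        means, stds = stats[speaker]
        normalized.append(
            [[(x - m) / s for x, m, s in zip(frame, means, stds)]
             for frame in frames]
        )
    return normalized
```

If every file were treated as its own speaker, the statistics would be estimated from much less data, which is one reason correct speaker grouping matters.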

Underlying technology
---------------------

@@ -34,8 +68,8 @@ The Montreal Forced Aligner uses the Kaldi ASR toolkit
Kaldi is under active development and includes state-of-the-art algorithms for tasks
in automatic speech recognition beyond forced alignment.

Relation to other forced alignment tools
----------------------------------------
Other forced alignment tools
============================

Most tools for forced alignment used by linguists rely on the HMM Toolkit
(HTK; `HTK homepage`_), including:
@@ -45,29 +79,29 @@ Most tools for forced alignment used by linguists rely on the HMM Toolkit
* FAVE-align (`FAVE-align homepage`_)
* (Web) MAUS (`MAUS homepage`_)

Praat (`Praat homepage`_)
has a built-in aligner as well.
EasyAlign (`EasyAlign homepage`_)
is a Praat plug-in built to facilitate its use.


Praat (`Praat homepage`_) has a built-in aligner as well.
EasyAlign (`EasyAlign homepage`_) is a Praat plug-in built to facilitate its use.

Montreal Forced Aligner is most similar to the Prosodylab-aligner, and
was developed at the same lab. Because the Montreal Forced Aligner uses
a different toolkit to do alignment, trained models cannot be used with
the Prosodylab-aligner, and vice versa.

Contributors
------------
============

* Michael McAuliffe
* Michael McAuliffe ([email protected], `Github`_, `@wavable`_)
* Michaela Socolof
* Sarah Mihuc
* Michael Wagner

Citation
--------
========

McAuliffe, Michael, Michaela Socolof, Sarah Mihuc, and Michael Wagner (2016).
Montreal Forced Aligner [Computer program]. Version 0.5,
retrieved 13 July 2016 from http://montrealcorpustools.github.io/Montreal-Forced-Aligner/.

Funding
-------
=======
