Some doc updates

MontrealCorpusTools · Jul 11, 2016 · e392571
1 parent 2c9c11d

Showing 5 changed files with 59 additions and 20 deletions.
diff --git a/docs/source/aligning.rst b/docs/source/aligning.rst
@@ -38,7 +38,7 @@ Align using pretrained models
 The Montreal Forced Aligner comes with pretrained models/dictionaries for:
 
 - English - trained from the LibriSpeech data set (`LibriSpeech corpus`_)
-- Quebec French
+- Quebec French - coming soon
 
 Command template:
 

diff --git a/docs/source/data_format.rst b/docs/source/data_format.rst
@@ -4,7 +4,9 @@
 Data formats
 ************
 
-Prosodylab-Aligner format
+.. _prosodylab_format:
+
+Prosodylab-aligner format
 =========================
 
 Every .wav sound file you are aligning must have a corresponding .lab
@@ -35,6 +37,8 @@ for words and a tier for phones.
 
 <<PICTURE OF OUTPUT TEXTGRID - ALA A LIBRISPEECH UTTERANCE>>
 
+.. _textgrid_format:
+
 TextGrid format
 ===============
 
@@ -76,7 +80,7 @@ be replaced in the output with '<unk>' for unknown word.
    the unknown words per utterance.
 
 As part of parsing orthographic transcriptions, punctuation is stripped
-from the ends of words.  In addition, all words are converted to lowercase 
+from the ends of words.  In addition, all words are converted to lowercase
 so that dictionary lookup is not case-sensitive.
 
 Dictionary lookup will attempt to generate the maximal coverage of

diff --git a/docs/source/dictionary.rst b/docs/source/dictionary.rst
@@ -1,4 +1,3 @@
-.. _dictionary:
 
 .. _`LibriSpeech lexicon`: http://www.openslr.org/resources/11/librispeech-lexicon.txt
 
@@ -8,6 +7,7 @@
 
 .. _`Prosodylab-aligner French dictionary`: https://github.com/prosodylab/prosodylab-alignermodels/blob/master/FrenchQuEu/fr-QuEu.dict
 
+.. _dictionary:
 
 ************
 Dictionaries

diff --git a/docs/source/installation.rst b/docs/source/installation.rst
@@ -1,9 +1,10 @@
-.. _installation:
 
 .. _`Montreal Forced Aligner releases`: https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases
 
 .. _`Kaldi GitHub repository`: https://github.com/kaldi-asr/kaldi
 
+.. _installation:
+
 ************
 Installation
 ************

diff --git a/docs/source/introduction.rst b/docs/source/introduction.rst
@@ -1,4 +1,3 @@
-.. _introduction:
 
 .. _`Kaldi homepage`: http://kaldi-asr.org/
 
@@ -16,16 +15,51 @@
 
 .. _`EasyAlign homepage`: http://latlcui.unige.ch/phonetique/easyalign.php
 
+.. _`@wavable`: https://twitter.com/wavable
+
+.. _`Github`: http://mmcauliffe.github.io/
+
+.. _introduction:
+
+************
 Introduction
-============
+************
 
 What is forced alignment?
--------------------------
+=========================
 
 Forced alignment is a technique to take an orthographic transcription of
 an audio file and generate a time-aligned version using a pronunciation
 dictionary to look up phones for words.
 
+
+Montreal Forced Aligner
+=======================
+
+Pipeline of training
+--------------------
+
+The Montreal Forced Aligner goes through three stages of training.  The
+first pass of alignment uses monophone models, where each phone is modelled
+the same regardless of phonological context.  The second pass uses triphone
+models, where context on either side of a phone is taken into account for
+acoustic models.  The final pass enhances the triphone model by taking
+into account speaker differences, and calculates a transformation of the
+mel-frequency cepstral coefficient (MFCC) features for each speaker.
+
+Use of speaker information
+--------------------------
+
+A key feature of the Montreal Forced Aligner is the use of speaker
+adaptation in alignment.  The command line interface provides multiple
+ways of grouping audio files by speaker, depending on the input file format
+(either :ref:`prosodylab_format` or :ref:`textgrid_format`).
+In addition to speaker adaptation in the final pass of alignment, speaker
+information is used for grouping audio files together for multiprocessing
+and cepstral mean and variance normalization (CMVN).  If speakers are not
+properly specified, then feature calculation might not succeed due to
+limits on the number of open files.
+
 Underlying technology
 ---------------------
 
@@ -34,8 +68,8 @@ The Montreal Forced Aligner uses the Kaldi ASR toolkit
 Kaldi is under active development and includes state-of-the-art algorithms for tasks
 in automatic speech recognition beyond forced alignment.
 
-Relation to other forced alignment tools
-----------------------------------------
+Other forced alignment tools
+============================
 
 Most tools for forced alignment used by linguists rely on the HMM Toolkit
 (HTK; `HTK homepage`_), including:
@@ -45,29 +79,29 @@ Most tools for forced alignment used by linguists rely on the HMM Toolkit
 * FAVE-align (`FAVE-align homepage`_)
 * (Web) MAUS (`MAUS homepage`_)
 
-Praat (`Praat homepage`_)
-has a built-in aligner as well.
-EasyAlign (`EasyAlign homepage`_)
-is a Praat plug-in built to facilitate its use.
-
-
+Praat (`Praat homepage`_) has a built-in aligner as well.
+EasyAlign (`EasyAlign homepage`_) is a Praat plug-in built to facilitate its use.
 
+The Montreal Forced Aligner is most similar to the Prosodylab-aligner and
+was developed at the same lab.  Because the Montreal Forced Aligner uses
+a different toolkit to do alignment, trained models cannot be used with
+the Prosodylab-aligner, and vice versa.
 
 Contributors
-------------
+============
 
-* Michael McAuliffe
+* Michael McAuliffe ([email protected], `Github`_, `@wavable`_)
 * Michaela Socolof
 * Sarah Mihuc
 * Michael Wagner
 
 Citation
---------
+========
 
 McAuliffe, Michael, Michaela Socolof, Sarah Mihuc, and Michael Wagner (2016).
 Montreal Forced Aligner [Computer program]. Version 0.5,
 retrieved 13 July 2016 from http://montrealcorpustools.github.io/Montreal-Forced-Aligner/.
 
 Funding
--------
+=======