sweekarsud/Goodness-of-Pronunciation

Goodness of Pronunciation (GoP)

This code implements the work described in the INTERSPEECH 2019 paper "An improved goodness of pronunciation (GoP) measure for pronunciation evaluation with DNN-HMM system considering HMM transition probabilities".

Requirements:

  • Python (tested with v2.7.5 and v3.5.7).
  • Kaldi ASR toolkit (documentation: http://kaldi-asr.org/), with acoustic models trained on LibriSpeech using nnet2 (Dan's recipe); tested with nnet2 and nnet3.

How to run the code:

Run prop_gop_eqn.py to compute the score with the proposed GoP formulation. Pass it the posterior_infile.ark and alignment_infile.txt generated for a given learner's utterance, followed by the name of the output file:

python prop_gop_eqn.py posterior_infile.ark alignment_infile.txt gop_outfile.txt
  • alignment_infile.txt is the output of the forced alignment of the learner's uttered speech (.wav file); it is obtained using align.sh.
  • posterior_infile.ark contains the frame-level posterior probabilities of the learner's uttered speech (.wav file); it is obtained using nnet_am_compute.cc.
  • gop_outfile.txt contains the score for each phoneme.
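
The invocation above can also be wrapped in a small helper. The following is only a sketch, not part of the repository: the function names build_gop_command, missing_inputs, and run_gop are hypothetical, and the only thing taken from this README is the script name and the argument order (posterior archive, alignment file, output file). It checks that both input files exist before launching the script.

```python
import os
import subprocess


def build_gop_command(posterior_ark, alignment_txt, gop_out, python="python"):
    """Return the argument list in the order used by the README command."""
    # Order matters: posterior archive, then alignment file, then output file.
    return [python, "prop_gop_eqn.py", posterior_ark, alignment_txt, gop_out]


def missing_inputs(*paths):
    """Return the subset of paths that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


def run_gop(posterior_ark, alignment_txt, gop_out):
    """Run prop_gop_eqn.py after checking that both input files exist."""
    missing = missing_inputs(posterior_ark, alignment_txt)
    if missing:
        raise IOError("missing input file(s): " + ", ".join(missing))
    # check_call raises CalledProcessError if the script exits with an error.
    subprocess.check_call(build_gop_command(posterior_ark, alignment_txt, gop_out))
```

Calling run_gop("posterior_infile.ark", "alignment_infile.txt", "gop_outfile.txt") from the repository folder would then be equivalent to the command shown above, with an earlier and clearer failure if an input file is absent.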

NOTE:

  • The above Python script requires a lookup table to generate the scores for an acoustic model, as discussed in the paper. The table can be generated with:
./gen_lookup_table.sh

Placement of the downloaded folder:

  • Once Goodness-of-Pronunciation-master.zip is downloaded, place it in /home/user/kaldi/egs/Native_Acoustic_Model/s5/ and unzip it there (Extract Here). This creates the path /home/user/kaldi/egs/Native_Acoustic_Model/s5/Goodness-of-Pronunciation-master/. The native acoustic model must be trained with nnet2, with all paths in the exp folder functional.
  • Once the path is created, it will have the following file structure:
├── kaldi_folder
│   ├── native_acoustic_model
│   │   ├── s5
│   │   │   ├── Goodness-of-Pronunciation-master
│   │   │   │   ├── extract_from_alignments.sh
│   │   │   │   ├── gen_lookup_table.sh
│   │   │   │   ├── modify_post.sh
│   │   │   │   ├── gop_outfile.txt
│   │   │   │   ├── prop_gop_eqn.py
│   │   │   │   ├── reqd_files
│   │   │   │   │   ├── alignment_infile.txt
│   │   │   │   │   ├── posterior.txt
│   │   │   │   │   ├── posterior_infile.ark
│   │   │   │   │   ├── show_transitions.txt
│   │   │   │   │   ├── lookup_table.txt
│   │   │   │   │   ├── tmp_t_ids.txt
│   │   │   │   │   ├── tmp_phones.txt
│   │   │   │   │   ├── tmp_segments.txt
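
As a quick sanity check after unzipping, the layout above can be compared against what is actually on disk. This is a minimal sketch under stated assumptions: the helper missing_entries is hypothetical, and the file names are simply copied from the tree above. Some of these files (for example gop_outfile.txt or the tmp_* files) may only appear after the scripts have been run, so a missing entry is not necessarily an error.

```python
import os

# File names copied from the tree in this README (assumed to be complete).
TOP_LEVEL = [
    "extract_from_alignments.sh",
    "gen_lookup_table.sh",
    "modify_post.sh",
    "gop_outfile.txt",
    "prop_gop_eqn.py",
]
REQD_FILES = [
    "alignment_infile.txt",
    "posterior.txt",
    "posterior_infile.ark",
    "show_transitions.txt",
    "lookup_table.txt",
    "tmp_t_ids.txt",
    "tmp_phones.txt",
    "tmp_segments.txt",
]


def missing_entries(root):
    """Return the relative paths from the expected layout not found under root."""
    expected = TOP_LEVEL + [os.path.join("reqd_files", name) for name in REQD_FILES]
    return [rel for rel in expected if not os.path.exists(os.path.join(root, rel))]


if __name__ == "__main__":
    # "." assumes the script is run from inside Goodness-of-Pronunciation-master.
    for rel in missing_entries("."):
        print("not found: " + rel)
```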

Citing:

If you find our work useful, please cite:

@inproceedings{Sudhakara2019,
  author={Sweekar Sudhakara and Manoj Kumar Ramanathi and Chiranjeevi Yarra and Prasanta Kumar Ghosh},
  title={{An Improved Goodness of Pronunciation (GoP) Measure for Pronunciation Evaluation with DNN-HMM System Considering HMM Transition Probabilities}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={954--958},
  doi={10.21437/Interspeech.2019-2363},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2363}
}