Recipe for applying the embedded segmental k-means model to the ZeroSpeech2017 Track 2 challenge.

kamperh/recipe_zs2017_track2

Embedded Segmental K-Means for ZeroSpeech2017 Track 2

Warning

This is a preliminary version of our system. It is not a final recipe and is still under development.

Overview

A description of the challenge can be found here: http://sapience.dec.ens.fr/bootphon/2017/index.html.

Disclaimer

The code provided here is not pretty. But I believe that research should be reproducible, and I hope that this repository is sufficient to make this possible for the paper mentioned above. I provide no guarantees with the code, but please let me know if you have any problems, find bugs or have general comments.

Preliminaries

Clone the zerospeech repositories:

mkdir -p ../src/
git clone https://github.com/bootphon/zerospeech2017.git \
    ../src/zerospeech2017/
# To-do: add installation and data download instructions
git clone https://github.com/bootphon/zerospeech2017_surprise.git \
    ../src/zerospeech2017_surprise/

Clone the eskmeans repository:

git clone https://github.com/kamperh/eskmeans.git \
    ../src/eskmeans/

Get the surprise data, substituting your own path for the one shown:

cd ../src/zerospeech2017_surprise/
source download_surprise_data.sh \
    /share/data/lang/users/kamperh/zerospeech2017/data/surprise/
cd -

Update all the paths in paths.py to match your directory structure.
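
Before editing paths.py, it can help to confirm that the clones above are in place. The following is a minimal sketch, assuming it is run from the recipe's root directory so that `../src/` resolves as in the commands above. The helper `missing_repos` is hypothetical and not part of this recipe.

```python
from pathlib import Path

# Clone targets taken from the git commands above.
EXPECTED_REPOS = ("zerospeech2017", "zerospeech2017_surprise", "eskmeans")


def missing_repos(src_dir="../src"):
    """Return the expected clone directories that are absent under src_dir."""
    src = Path(src_dir)
    return [name for name in EXPECTED_REPOS if not (src / name).is_dir()]


if __name__ == "__main__":
    missing = missing_repos()
    if missing:
        print("Missing clones under ../src/:", ", ".join(missing))
    else:
        print("All expected repositories are present.")
```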

Feature extraction

Extract MFCC features by running the steps in features/readme.md.

Unsupervised syllable boundary detection

We use the unsupervised syllable boundary detection algorithm described in:

  • O. J. Räsänen, G. Doyle, and M. C. Frank, "Unsupervised word discovery from speech using automatic segmentation into syllable-like units," in Proc. Interspeech, 2015.

Obtain the syllable boundaries by running the steps in syllables/readme.md.

Acoustic word embeddings: downsampling

We use one of the simplest methods to obtain acoustic word embeddings: downsampling. Different types of input features can be used. Run the steps in downsample/readme.md.
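
To illustrate the idea, here is a minimal sketch of downsampling: a variable-length sequence of feature frames is sampled at a fixed number of evenly spaced points and flattened into one vector. The function name `downsample`, the default of 10 samples, and the use of linear interpolation are assumptions for illustration; the scripts in downsample/ may differ in all three.

```python
def downsample(frames, n_samples=10):
    """Map a variable-length sequence of frames to a fixed-length vector.

    frames: list of T frames, each a list of D floats (e.g. MFCCs).
    Returns a flat list of n_samples * D floats.
    """
    if not frames:
        raise ValueError("segment must contain at least one frame")
    n_frames = len(frames)
    embedding = []
    for i in range(n_samples):
        # Position of the i-th sample on the original frame axis.
        if n_samples == 1:
            pos = (n_frames - 1) / 2.0
        else:
            pos = i * (n_frames - 1) / (n_samples - 1)
        lo = int(pos)
        hi = min(lo + 1, n_frames - 1)
        w = pos - lo
        # Linear interpolation between the two neighbouring frames.
        embedding.extend(
            (1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])
        )
    return embedding
```

Whatever the duration of the input segment, the output has `n_samples * D` dimensions, so segments of different lengths can be compared directly in the same vector space.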

Unsupervised segmentation and clustering

Segmentation and clustering are performed using the ESKMeans package. Run the steps in segmentation/readme.md.
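
For intuition only, the sketch below shows one segmentation pass of the kind an embedded segmental k-means model performs: given candidate boundaries (e.g. syllable boundaries), an embedding function, and fixed cluster centroids, dynamic programming picks the segmentation with the lowest duration-weighted squared distance to the nearest centroid. This is not the ESKMeans package's API. The names `segment`, `nearest_centroid`, `mean_embed` and the `max_span` parameter are hypothetical, and the toy mean embedding stands in for the downsampled embeddings used in the recipe. A full model would alternate this step with a centroid update.

```python
def mean_embed(frames):
    """Toy embedding (assumption): the per-dimension mean of the frames."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def nearest_centroid(vec, centroids):
    """Return (index, squared distance) of the centroid closest to vec."""
    best_k, best_d = -1, float("inf")
    for k, c in enumerate(centroids):
        d = sum((a - b) ** 2 for a, b in zip(vec, c))
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d


def segment(frames, boundaries, centroids, embed=mean_embed, max_span=3):
    """Pick the lowest-cost segmentation of one utterance.

    boundaries: sorted candidate frame indices, from 0 to len(frames).
    max_span: maximum number of candidate intervals per segment.
    Returns a list of (start, end, cluster) tuples.
    """
    n = len(boundaries)
    best_cost = [0.0] + [float("inf")] * (n - 1)
    back = [None] * n
    for j in range(1, n):
        for i in range(max(0, j - max_span), j):
            start, end = boundaries[i], boundaries[j]
            k, dist = nearest_centroid(embed(frames[start:end]), centroids)
            # Weight by duration so long segments are not favoured for free.
            cost = best_cost[i] + (end - start) * dist
            if cost < best_cost[j]:
                best_cost[j] = cost
                back[j] = (i, k)
    # Trace back from the final boundary to recover the chosen segments.
    segments = []
    j = n - 1
    while j > 0:
        i, k = back[j]
        segments.append((boundaries[i], boundaries[j], k))
        j = i
    return segments[::-1]
```

For example, with `frames = [[0.0], [0.0], [5.0], [5.0]]`, `boundaries = [0, 2, 4]`, and `centroids = [[0.0], [5.0]]`, splitting at frame 2 costs nothing, while keeping a single segment places its mean far from both centroids, so the split is chosen.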

Dependencies

Collaborators
