These are transcriptions of the recordings in the CSLU Foreign Accented Speech Corpus of non-native English speech. The transcriptions were provided by workers on Amazon Mechanical Turk.
Due to resource constraints, we collected transcriptions for only about a third of the data, covering native speakers of 7 of the 23 languages (Arabic, Czech, French, Hindi, Indonesian, Mandarin, and Korean).
If you use these transcriptions, please cite Emily's senior thesis, A Computational Approach to Foreign Accent Classification, Wellesley College, 2016, available here.