Segmented speech files from the Thai Parliament meetings. Could be used to train ASR/speech-to-text systems.

ppnplus/thai-parliament-speech-dataset

thai-parliament-speech-dataset

Segmented speech files from the Thai Parliament meetings. Could be used to train ASR/speech-to-text systems.

TODO

  • Exclude files that are too short or too long (keeping roughly 0.7 s to 15.0 s should be fine), are not clearly audible, or have overlapping speakers or other conditions that could confuse an ASR system
  • Create a new Common Voice-style CSV file for training
  • Transcribe the segmented audio. Right now, this repository contains only audio files and no transcriptions (text).
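
The first two items could be sketched roughly as follows. This is a minimal sketch, not the repository's tooling: it assumes the segments are WAV files readable by Python's standard `wave` module, and the function names and the `path`, `duration`, `sentence` column layout are illustrative choices that only approximate a Common Voice-style manifest. The `sentence` column is left empty because no transcriptions exist yet. Clarity and speaker-overlap checks are not covered here.

```python
import csv
import wave
from pathlib import Path

MIN_SECONDS = 0.7   # lower duration bound suggested in the TODO list
MAX_SECONDS = 15.0  # upper duration bound suggested in the TODO list


def wav_duration(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def select_segments(audio_dir, min_s=MIN_SECONDS, max_s=MAX_SECONDS):
    """Return (path, duration) pairs whose duration lies in [min_s, max_s]."""
    kept = []
    for path in sorted(Path(audio_dir).rglob("*.wav")):
        duration = wav_duration(path)
        if min_s <= duration <= max_s:
            kept.append((path, duration))
    return kept


def write_manifest(segments, csv_path):
    """Write a CSV manifest; the column names are an assumed layout."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "duration", "sentence"])
        for path, duration in segments:
            # "sentence" stays empty until the audio is transcribed.
            writer.writerow([str(path), f"{duration:.2f}", ""])
```

With a hypothetical `audio/` directory, `write_manifest(select_segments("audio/"), "train.csv")` would produce the manifest. Files in other formats (for example MP3 or FLAC) would need a different duration reader.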
