thai-parliament-speech-dataset

Segmented speech files from Thai Parliament meetings. They can be used to train automatic speech recognition (ASR) / speech-to-text systems.

TODO

Exclude files that are too short or too long (keeping 0.7 s to 15.0 s should be fine), that are not clearly audible, or that have overlapping speakers or other conditions that could confuse the ASR system
Create a new Common Voice-style CSV file for training
Transcribe the segmented audio. Right now, this repository contains only audio files and no transcriptions (text).
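
The first two TODO items could be sketched roughly as follows. This is only an illustration, not the repository's tooling: it assumes the segments are WAV files readable by Python's standard `wave` module, and the function names and the CSV columns (`path`, `duration`, `sentence`, loosely modeled on the `path` and `sentence` fields of Common Voice) are hypothetical. It filters on duration only; checks for unclear audio or overlapping speakers would need listening or a separate tool.

```python
import csv
import wave
from pathlib import Path

MIN_SECONDS = 0.7   # lower bound suggested in the TODO
MAX_SECONDS = 15.0  # upper bound suggested in the TODO


def wav_duration(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_duration(seconds, lo=MIN_SECONDS, hi=MAX_SECONDS):
    """Return True if the clip length is inside the accepted range (inclusive)."""
    return lo <= seconds <= hi


def build_manifest(audio_dir, csv_path):
    """Write one CSV row per in-range clip and return the number of rows written.

    The column names are placeholders (an assumption), not a fixed schema.
    """
    root = Path(audio_dir)
    kept = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "duration", "sentence"])
        for wav_path in sorted(root.rglob("*.wav")):
            seconds = wav_duration(wav_path)
            if is_usable_duration(seconds):
                # "sentence" stays empty until the clip has been transcribed.
                rel = wav_path.relative_to(root).as_posix()
                writer.writerow([rel, f"{seconds:.2f}", ""])
                kept += 1
    return kept
```

Calling `build_manifest("2022-07-21", "manifest.csv")` would scan that folder recursively and list only the clips between 0.7 s and 15.0 s, assuming the folder holds WAV files. The empty `sentence` column can be filled in later as transcriptions become available.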
