✨ Add speech recognition training & other improvements #147

Merged 36 commits on Feb 7, 2024

@arxyzan arxyzan commented Feb 7, 2024

Pull Request

Description

This PR mainly adds ASR (automatic speech recognition) support, covering the dataset, data collator, metrics handling, and training. Along the way, some other improvements and bug fixes were also introduced.

Changes

  • Add SpeechRecognitionDataset and SpeechRecognitionDataCollator
  • Add SpeechRecognitionMetricsHandler
  • Add training script for ASR in examples
  • Fix bugs in Tokenizer
  • Fix bugs in AudioFeatureExtractor
  • Fix bugs in config
  • Add dataset load script to templates for ASR
  • Improve tests
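
For readers unfamiliar with what a speech recognition data collator typically does, here is a minimal, library-free sketch. It is not the `SpeechRecognitionDataCollator` added in this PR: the class name `PaddingCollatorSketch`, the field names `input_features`, `attention_mask`, and `labels`, and the padding values are all assumptions chosen for illustration. The idea is only to show padding variable-length audio features and label ids to a common length within a batch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PaddingCollatorSketch:
    """Hypothetical stand-in; not the collator implemented in this PR."""

    feature_pad_value: float = 0.0  # assumed padding value for audio features
    label_pad_id: int = -100        # assumed "ignore" id for padded label positions

    def __call__(self, samples: List[Dict[str, list]]) -> Dict[str, list]:
        # Pad to the longest sample in this batch, not to a global maximum.
        max_feat = max(len(s["input_features"]) for s in samples)
        max_lab = max(len(s["labels"]) for s in samples)

        batch = {"input_features": [], "attention_mask": [], "labels": []}
        for s in samples:
            feats, labels = s["input_features"], s["labels"]
            n_pad = max_feat - len(feats)
            batch["input_features"].append(list(feats) + [self.feature_pad_value] * n_pad)
            # 1 marks real frames, 0 marks padding.
            batch["attention_mask"].append([1] * len(feats) + [0] * n_pad)
            batch["labels"].append(list(labels) + [self.label_pad_id] * (max_lab - len(labels)))
        return batch


collate = PaddingCollatorSketch()
batch = collate([
    {"input_features": [0.1, 0.2, 0.3], "labels": [5, 6]},
    {"input_features": [0.4], "labels": [7, 8, 9]},
])
print(batch["labels"])  # [[5, 6, -100], [7, 8, 9]]
```

Padding labels with a dedicated id (here the assumed `-100`) lets a loss function skip padded positions; whether the collator in this PR does the same should be checked against the code itself.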

Related Issues

Mainly focuses on #72

Checklist

  • I have read and followed the project's contributing guidelines.
  • My code follows the project's coding style.
  • I have tested my changes thoroughly.
  • I have updated the documentation if necessary.
  • All existing tests pass.
  • I have added new tests to cover my changes.
  • My changes do not introduce any new warnings or errors.

Additional Comments

Reviewer Instructions

Author's Note

Setting `max_length` in tokenizer call had unexpected behavior
@arxyzan arxyzan merged commit d7dadd6 into main Feb 7, 2024
1 check passed
@arxyzan arxyzan mentioned this pull request Feb 7, 2024
@arxyzan arxyzan deleted the asr-training branch February 10, 2024 12:37