Improvements and tweaks to correction process #2

Open
beveradb opened this issue Nov 17, 2023 · 0 comments

beveradb commented Nov 17, 2023

Currently the output from the correction process is good but not quite good enough for general release, even for the test track:
https://github.com/karaokenerds/python-lyrics-transcriber/releases/download/v0.12.1/ABBAUnderAttack-mp3_b5bafd91e7421f0635baa9005f5a3119.mp4

Noticeable issues include:

  • Some overlapping / confusing lyrics (e.g. when there are background vocals at the same time as primary)
    • The scrolling lyrics should ideally track only the primary vocals for now, although this makes me think we could possibly handle duets with this approach in future!
    • One approach to resolving this might be to separate the audio with a background-vocals (BVE) model first and pass only the primary vocal stem to Whisper. TBD whether this helps.
  • One or two lines of totally unexpected lyrics; need to investigate why. I suspect these are also backing vocals which couldn't find a place.
  • At least one case where there was still a misheard word left in the corrected lyrics ("shattered" vs. "flattered"); this is less critical but possibly an opportunity to tweak the prompt to give it more guidance, e.g. asking it to watch out for sound-alike words like this and correct them.

It would be good to add some more functionality to the correction method to help debug and tweak the prompt, e.g. logging the changes made to each segment to a file, and printing or writing an overall diff of the Spotify lyrics vs. the corrected lyrics.
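
As a rough illustration of the two debugging aids mentioned above, here is a minimal sketch using only the Python standard library. The function names (`lyrics_diff`, `log_segment_changes`), the JSON-lines log format, and the assumption that segments are available as plain strings are all hypothetical and not taken from the existing codebase; the real segment structure may well differ.

```python
import difflib
import json
from pathlib import Path


def lyrics_diff(reference: str, corrected: str) -> str:
    """Return a line-by-line unified diff of reference lyrics vs. corrected lyrics."""
    diff = difflib.unified_diff(
        reference.splitlines(),
        corrected.splitlines(),
        fromfile="reference_lyrics",  # e.g. the lyrics fetched from Spotify
        tofile="corrected_lyrics",    # e.g. the output of the correction step
        lineterm="",
    )
    return "\n".join(diff)


def log_segment_changes(original_segments, corrected_segments, log_path):
    """Append one JSON line per segment whose text changed.

    Assumes (hypothetically) that both arguments are equal-length lists of
    strings. Returns the number of segments that changed.
    """
    changed = 0
    with Path(log_path).open("a", encoding="utf-8") as log_file:
        pairs = zip(original_segments, corrected_segments)
        for index, (before, after) in enumerate(pairs):
            if before != after:
                record = {"segment": index, "before": before, "after": after}
                log_file.write(json.dumps(record) + "\n")
                changed += 1
    return changed


if __name__ == "__main__":
    # Made-up example lines, not real lyrics.
    reference = "first example line\nI was flattered today"
    corrected = "first example line\nI was shattered today"
    print(lyrics_diff(reference, corrected))
```

A diff like this would make leftover sound-alike words (such as the "shattered" vs. "flattered" case) easy to spot after each prompt tweak, and the per-segment log could be compared between runs.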
