Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(converter): add video converter. #193

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

absadiki
Copy link

@absadiki absadiki commented Dec 21, 2024

This PR is a step towards resolving #154.

It introduces a VideoConverter class that converts videos to markdown by:

  • Extracting metadata (if exiftool is installed)
  • Performing speech transcription (if speech_recognition and pydub are installed)
  • Generating a summary via a multimodal LLM from the transcription [This is optional and defaults to True if llm_client is configured]

Notes:

  • I believe checking the file type based on the extension is not ideal. There are many video extensions, and I think checking the mime_type would be a better approach, as it can cover a wider range of video files.
  • I’m unsure about the testing strategy .. should we focus only on testing exiftool? Please share your thoughts on this.
  • Additionally, I suggest refactoring Mp3Converter into a more general AudioConverter, as there are many audio extensions to consider. If you agree with this, I can submit a separate PR for it.

@l-lumin
Copy link
Contributor

l-lumin commented Dec 22, 2024

could you add tests?

@absadiki
Copy link
Author

@l-lumin, could you provide a sample video file that is allowed to be uploaded to the repo?

@l-lumin
Copy link
Contributor

l-lumin commented Dec 23, 2024

@l-lumin, could you provide a sample video file that is allowed to be uploaded to the repo?

I think you can use the file you tested locally.If it's wrong, can change it later

@absadiki
Copy link
Author

@l-lumin, okey I created a sample video file using ffmpeg. I've added test for exiftool for now.
Maybe we can add tests for transcription as well, but #194 should be merged first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants