I can take this up. I've previously used marker for some of my personal projects, and it works really well across various document types. I can try both markitdown and marker and compare the results.
Motivation
The data cleaning part of the document-to-podcast workflow could be made more robust, as it currently does not handle all possible input cases.
Alternatives
Have also considered: https://github.com/VikParuchuri/marker, which could be an interesting alternative to 'markitdown'.
Contribution
Re-implementing the data cleaning component to use markitdown
This would make the data cleaning component more robust, and potentially reusable across many Blueprints.
There is potential to submit a PR that updates the data cleaning component to leverage markitdown.
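As a rough illustration of what the proposed change might look like, here is a minimal sketch. It assumes the `markitdown` package is installed and uses its `MarkItDown().convert(...).text_content` call; the function names `markitdown_converter` and `clean_document`, and the whitespace normalization step, are hypothetical and not taken from the existing Blueprint code. The converter is passed in as a parameter so that marker (or any other tool) could be swapped in for comparison.

```python
from typing import Callable


def markitdown_converter(path: str) -> str:
    """Convert a document to Markdown text with markitdown (assumed installed)."""
    # Imported lazily so this module still loads without the dependency.
    from markitdown import MarkItDown

    return MarkItDown().convert(path).text_content


def clean_document(
    path: str,
    convert: Callable[[str], str] = markitdown_converter,
) -> str:
    """Hypothetical data cleaning step: convert, then normalize whitespace."""
    text = convert(path)
    # Drop trailing whitespace on every line.
    lines = [line.rstrip() for line in text.splitlines()]
    cleaned: list[str] = []
    for line in lines:
        # Collapse runs of blank lines into a single blank line.
        if line == "" and cleaned and cleaned[-1] == "":
            continue
        cleaned.append(line)
    return "\n".join(cleaned).strip()
```

Calling `clean_document("report.pdf")` would use markitdown by default, while passing a different `convert` callable would allow the same cleaning step to be run on the output of another converter.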
Have you searched for similar issues before submitting this one?