This application converts PDF documents to markdown format using the pymupdf4llm library. It features a Gradio interface for uploading files, and displays both the markdown content and the time taken for the conversion.
See the PyMuPDF4LLM API documentation for details on the library.
- Convert PDF files to markdown format.
- Upload PDF files through a Gradio web interface.
- Automatically saves the output markdown file using the original filename with a `.md` extension in the `output` folder.
- Displays the markdown content and the conversion time directly on the Gradio interface.
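
The conversion and saving behaviour listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `convert_pdf` and its `to_markdown` parameter are hypothetical, and the sketch assumes `pymupdf4llm.to_markdown` is called with a file path and returns the markdown as a string.

```python
import time
from pathlib import Path
from typing import Callable, Optional, Tuple


def convert_pdf(
    pdf_path: str,
    output_dir: str = "output",
    to_markdown: Optional[Callable[[str], str]] = None,
) -> Tuple[str, float, Path]:
    """Convert a PDF to markdown, time the conversion, and save the result.

    `to_markdown` is a hypothetical hook so the converter can be swapped out;
    by default the function falls back to pymupdf4llm.to_markdown.
    """
    if to_markdown is None:
        import pymupdf4llm  # imported lazily so the module loads without it

        to_markdown = pymupdf4llm.to_markdown

    # Time only the conversion itself, not the file write.
    start = time.perf_counter()
    markdown = to_markdown(str(pdf_path))
    elapsed = time.perf_counter() - start

    # Save as <original name>.md inside the output folder.
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / (Path(pdf_path).stem + ".md")
    out_file.write_text(markdown, encoding="utf-8")

    return markdown, elapsed, out_file
```

Under these assumptions, calling `convert_pdf("report.pdf")` would write `output/report.md` and return the markdown text together with the elapsed time in seconds.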
- Clone the repository:

      git clone <repository-url>
      cd test-pymupdf4llm

- Install the dependencies:

      poetry install

- Ensure you have the required Python version:
  - Python 3.10 or higher
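
One way to confirm the Python requirement above is a short check like the one below. It is only a sketch: the helper name `meets_minimum` is hypothetical and not part of the project.

```python
import sys

MINIMUM = (3, 10)  # minimum version stated in the requirements above


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if `version` (major, minor, ...) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```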
- Run the application:

      poetry run python src/test_pymupdf4llm/main.py

- Open the Gradio interface in your web browser.

- Upload a PDF file to convert it to markdown format.

- The markdown content and conversion time will be displayed on the interface, and a file will be saved in the `output` directory.
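
The upload flow described in the steps above might be wired up along these lines. This is a hedged sketch rather than the contents of `main.py`: `handle_upload` and `build_interface` are hypothetical names, `convert` stands in for whatever function performs the conversion and returns the markdown plus the elapsed seconds, and the component arguments assume Gradio 4 or later.

```python
from typing import Callable, Tuple


def handle_upload(
    pdf_path: str, convert: Callable[[str], Tuple[str, float]]
) -> Tuple[str, str]:
    """Return the two values shown in the UI: markdown text and a timing message."""
    markdown, seconds = convert(pdf_path)
    return markdown, f"Conversion time: {seconds:.2f} seconds"


def build_interface(convert: Callable[[str], Tuple[str, float]]):
    """Build a Gradio interface with one PDF upload and two outputs."""
    import gradio as gr  # imported lazily; assumes Gradio 4 or later

    def run(pdf_path: str) -> Tuple[str, str]:
        return handle_upload(pdf_path, convert)

    return gr.Interface(
        fn=run,
        # type="filepath" passes the uploaded file's path to `run` as a string.
        inputs=gr.File(label="PDF file", file_types=[".pdf"], type="filepath"),
        outputs=[gr.Markdown(), gr.Textbox(label="Conversion time")],
        title="PDF to Markdown",
    )
```

With a suitable `convert` function, `build_interface(convert).launch()` would start a local server, and Gradio prints the URL to open in the browser.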
This project is licensed under the MIT License.
For any inquiries, please contact the author at [email protected].