Skip to content

Latest commit

 

History

History
51 lines (33 loc) · 1.47 KB

README.md

File metadata and controls

51 lines (33 loc) · 1.47 KB

PDF to Markdown Converter

This application allows you to convert PDF documents to markdown format using the pymupdf4llm library. It features a Gradio interface for easy file uploads and conversion, displaying both the markdown content and the time taken for the conversion.

PyMuPDF4llm API Documentation

You can see the PyMuPDF4llm API documentation here.

Features

  • Convert PDF files to markdown format.
  • Upload PDF files through a Gradio web interface.
  • Automatically saves the output markdown file using the original filename with a .md extension in the output folder.
  • Displays the markdown content and the conversion time directly on the Gradio interface.

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd test-pymupdf4llm
  2. Install the dependencies:

    poetry install
  3. Ensure you have the required Python version:

    • Python 3.10 or higher

Usage

  1. Run the application:

    poetry run python src/test_pymupdf4llm/main.py
  2. Open the Gradio interface in your web browser.

  3. Upload a PDF file to convert it to markdown format.

  4. The markdown content and conversion time will be displayed on the interface, and a file will be saved in the output directory.

License

This project is licensed under the MIT License.

Contact

For any inquiries, please contact the author at [email protected].