EasyFigs: A Convolutional Neural Network Approach for Extracting Figures from PDFs

Overview

EasyFigs is a machine learning project for detecting and extracting figures from PDF documents. It grew out of the need to improve figure extraction from PDFs, which is often imprecise and time-consuming: existing PDF image extractors cannot reliably detect useful figures or extract their captions.

Data

The original data consists of research articles in PDF format from the arXiv website. The dataset includes 870 PDFs, which were split into pages and converted to images. From these, we selected 900 pages with figures and 100 pages without figures.
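The page selection described above can be sketched as follows. This is only an illustration: the README does not say how pages were chosen, so random sampling, the `Page` record, and its fields are all assumptions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    pdf_name: str     # source PDF file name (hypothetical field)
    page_index: int   # zero-based page number within that PDF
    has_figure: bool  # annotation: does the page contain a figure?


def select_pages(pages, n_with=900, n_without=100, seed=0):
    """Randomly pick n_with figure pages and n_without figure-free pages."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    with_fig = [p for p in pages if p.has_figure]
    without_fig = [p for p in pages if not p.has_figure]
    if len(with_fig) < n_with or len(without_fig) < n_without:
        raise ValueError("not enough pages in one of the two groups")
    chosen = rng.sample(with_fig, n_with) + rng.sample(without_fig, n_without)
    rng.shuffle(chosen)  # mix both groups so later splits are not ordered
    return chosen
```

Keeping a share of figure-free pages in the set gives the detector negative examples, which may help reduce false detections on text-only pages.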

Model Selection

The main model used in this project is YOLOv5s, a state-of-the-art object detection algorithm based on a Convolutional Neural Network. It has 191 layers and 7.5 × 10^6 trainable parameters. The model was chosen for its speed, accuracy, and compatibility with PyTorch and Roboflow.
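YOLOv5 reads its training labels from plain-text files, one line per object, in the form `class x_center y_center width height` with all coordinates normalized to the range 0 to 1. The helper below is a minimal sketch of that conversion from pixel coordinates. The function name is hypothetical, and the single "figure" class with id 0 is an assumption; the README does not list the classes used.

```python
def to_yolo_label(box, image_width, image_height, class_id=0):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    inside_x = 0 <= x_min < x_max <= image_width
    inside_y = 0 <= y_min < y_max <= image_height
    if not (inside_x and inside_y):
        raise ValueError("box must lie inside the image and have positive area")
    # Normalize the box center and size by the image dimensions.
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 320 x 320 pixel figure on a 640 x 640 page image.
print(to_yolo_label((160, 320, 480, 640), 640, 640))
# prints "0 0.500000 0.750000 0.500000 0.500000"
```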

Hyperparameter Tuning

Hyperparameters were selected from the results of a hyperparameter evolution process driven by a Genetic Algorithm. The model was trained for 150 epochs with a batch size of 32 and an image size of 640 × 640 pixels.
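To illustrate the idea behind hyperparameter evolution, here is a deliberately simplified, mutation-only evolutionary search. It is not the project's actual procedure: there is no population or crossover, the fitness function is a toy stand-in for "train the model and return validation mAP", and the hyperparameter names, bounds, and starting values are illustrative assumptions.

```python
import random


def mutate(parent, bounds, rng, sigma=0.2, prob=0.8):
    """Return a child: each value is scaled by Gaussian noise with probability prob."""
    child = {}
    for name, value in parent.items():
        low, high = bounds[name]
        if rng.random() < prob:
            value *= 1.0 + rng.gauss(0.0, sigma)
        child[name] = min(max(value, low), high)  # clip to the allowed range
    return child


def evolve(fitness, initial, bounds, generations=50, seed=0):
    """Keep the best candidate seen so far and mutate it each generation."""
    rng = random.Random(seed)
    best, best_score = dict(initial), fitness(initial)
    for _ in range(generations):
        child = mutate(best, bounds, rng)
        score = fitness(child)
        if score > best_score:  # accept only strict improvements
            best, best_score = child, score
    return best, best_score


def toy_fitness(hyp):
    """Toy objective peaking at lr0=0.01, momentum=0.9 (not a real training run)."""
    return -((hyp["lr0"] - 0.01) ** 2) - ((hyp["momentum"] - 0.9) ** 2)


bounds = {"lr0": (1e-5, 1e-1), "momentum": (0.6, 0.98)}
start = {"lr0": 0.05, "momentum": 0.7}
best, score = evolve(toy_fitness, start, bounds)
```

In a real setting each fitness evaluation is a full or shortened training run, so the number of generations is usually limited by the available compute budget.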

Results

The primary evaluation metric is Mean Average Precision (mAP). The model performed well on both the testing dataset and a brand-new, previously unseen dataset.
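For readers unfamiliar with the metric, the sketch below computes Average Precision for a single class on a single image, using an IoU threshold of 0.5 to decide whether a prediction matches a ground-truth box. It uses the plain (non-interpolated) form of AP; evaluation toolkits typically work with an interpolated precision-recall curve over the whole dataset, and mAP is the mean of AP over classes. The function names and the threshold are assumptions for illustration, not the project's evaluation code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truth, iou_threshold=0.5):
    """Non-interpolated AP; predictions is a list of (confidence, box)."""
    if not ground_truth:
        return 0.0
    matched = set()  # indices of ground-truth boxes already claimed
    hits = []        # True for a true positive, False for a false positive
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        candidates = [
            (iou(box, gt), i)
            for i, gt in enumerate(ground_truth)
            if i not in matched
        ]
        best_iou, best_i = max(candidates, default=(0.0, -1))
        if best_iou >= iou_threshold:
            matched.add(best_i)
            hits.append(True)
        else:
            hits.append(False)
    # Walk down the ranked list; add precision at every rank that gains recall.
    ap, true_positives = 0.0, 0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            ap += (true_positives / rank) / len(ground_truth)
    return ap
```

With two ground-truth figures and three ranked predictions of which the first and third are correct, the precision values at the two recall steps are 1 and 2/3, giving an AP of 5/6.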

Future Work

While the model performed well, the dataset used was biased. Future work will involve using a larger and more generalized dataset. This work is expected to have a positive impact on the computer vision research community.

Links

Contributions

Kaison Cheung
Xuan Ze (Charlie) Li
Yahui Yang
Yanni Lu


Repository: kaison428/EasyFigs