This repo is the official implementation for Historical Astronomical Diagrams Decomposition in Geometric Primitives.
This repo builds on the code for DINO-DETR, the official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection".
We present a model which modifies DINO-DETR to perform historical astronomical diagram vectorization by predicting simple geometric primitives, such as lines, circles, and arcs.
1. Installation
The model was trained with python=3.11.0, pytorch=2.1.0, cuda=11.8 and builds on the DETR variants DINO/DN/DAB and Deformable-DETR.
- Clone this repository and create a virtual environment
git clone git@github.com:vayvi/HDV.git
cd HDV/
python3 -m venv venv
source venv/bin/activate
- Follow the instructions to install a PyTorch version compatible with your system and CUDA version
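For example, for pytorch=2.1.0 with cuda=11.8 (verify the exact command for your setup on pytorch.org):
pip install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cu118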
- Install other dependencies
pip install -r requirements.txt
- Compile the CUDA operators
python src/models/dino/ops/setup.py build install
# if you get a 'CUDA not available' error, run: export CUDA_HOME=/usr/local/cuda-<version>
# unit test (you should see that all checks are True); it may raise an out-of-memory error
python src/models/dino/ops/test.py
- Install the local package for synthetic data generation
pip install -e synthetic/.
2. Annotated Dataset and Model Checkpoint
Our annotated dataset, along with our main model checkpoints, can be found here. Annotations are in SVG format. We provide Python helper functions for parsing SVG files if you would like to process a custom annotated dataset.
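For a starting point, here is a minimal sketch of such a parser (an illustrative assumption using only the standard library, not the repo's actual helper API):

# Minimal sketch of extracting line/circle primitives from an annotation SVG.
# Illustrative assumption only; the repo's own helpers may use different names.
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

def parse_svg_primitives(svg_path):
    root = ET.parse(svg_path).getroot()
    primitives = []
    for elem in root.iter():
        if elem.tag == SVG_NS + "line":
            primitives.append({
                "type": "line",
                "points": [float(elem.get(k, 0)) for k in ("x1", "y1", "x2", "y2")],
            })
        elif elem.tag == SVG_NS + "circle":
            primitives.append({
                "type": "circle",
                "center": (float(elem.get("cx", 0)), float(elem.get("cy", 0))),
                "radius": float(elem.get("r", 0)),
            })
        # arcs are usually encoded as <path> elements with an 'A' command
    return primitives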
To download the manually annotated dataset, run:
bash scripts/download_eida_data.sh
Datasets should be organized as follows:
HDV/
data/
└── eida_dataset/
└── images_and_svgs/
└── custom_dataset/
└── images_and_svgs/
To download the pretrained models, run:
bash scripts/download_pretrained_models.sh
Checkpoints should be organized as follows:
HDV/
logs/
└── main_model/
└── checkpoint0012.pth
└── checkpoint0036.pth
└── config_cfg.py
└── other_model/
└── checkpoint0044.pth
└── config_cfg.py
...
You can process the ground-truth data for evaluation using:
bash scripts/process_annotated_data.sh "eida_dataset" # or "custom_dataset", etc.
3. Synthetic Dataset
The synthetic dataset generation process requires a collection of text and document backgrounds. We use the resources available in docExtractor and diagram-extraction. The code for generating the synthetic data is also heavily based on docExtractor.
To download the background resources for the synthetic dataset, run:
bash scripts/download_synthetic_resource.sh
Alternatively, download the synthetic resource folder here and unzip it in the data/ folder.
1. Evaluate our pretrained models
After downloading a model checkpoint and processing the evaluation dataset, you can evaluate a pretrained model as follows:
- model_name: corresponds to the folder inside logs/ where the checkpoint file is located
- epoch_number: epoch number of the checkpoint file to be used
- data_folder_name: name of the folder inside data/ where the evaluation dataset is located (defaults to eida_dataset)
bash scripts/evaluate_on_eida_final.sh <model_name> <epoch_number> <data_folder_name>
# for logs/main_model/checkpoint0036.pth on eida_dataset
bash scripts/evaluate_on_eida_final.sh main_model 0036 eida_dataset
# for logs/eida_demo_model/checkpoint0044.pth on eida_dataset
bash scripts/evaluate_on_eida_final.sh eida_demo_model 0044 eida_dataset
You should get the AP for different primitives and for different distance thresholds.
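For intuition, a prediction typically counts as a true positive when its parameters fall within the distance threshold of an unmatched ground-truth primitive. A minimal sketch of such endpoint-based matching (the function and matching criterion are assumptions, not the repo's actual metric code):

# Illustrative endpoint-distance matching; an assumption for intuition,
# not the repo's actual evaluation code.
import numpy as np

def count_true_positives(preds, gts, threshold):
    # preds, gts: arrays of shape (N, K, 2) holding K 2D keypoints
    # (e.g. the two endpoints of a line) per primitive
    matched = set()
    tp = 0
    for p in preds:
        for j, g in enumerate(gts):
            # a prediction matches if all K keypoints lie within `threshold`
            if j not in matched and np.linalg.norm(p - g, axis=-1).max() <= threshold:
                matched.add(j)
                tp += 1
                break
    return tp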
If you want to run evaluation on all checkpoints available for a given model, you can use the following script:
bash scripts/evaluate_models_on_gt.sh <ground_truth> <?model_name> <?device_nb> <?batch_size> <?max_size>
# to evaluate all available models on ground truth (cf. svg_to_train.py script)
bash scripts/evaluate_models_on_gt.sh eida_dataset/groundtruth
# to evaluate only one model
bash scripts/evaluate_models_on_gt.sh eida_dataset/groundtruth main_model
2. Inference and Visualization
To run inference and visualize results on custom images, you can use this notebook.
You can also use the following script to run inference on a whole dataset (jpg images located in data/<data_set>/images/):
bash scripts/run_inference.sh <model_name> <epoch_number> <data_set> <export_formats>
# for logs/main_model/checkpoint0036.pth on eida_dataset with svg and npz export formats
bash scripts/run_inference.sh main_model 0036 eida_dataset svg+npz
Results will be saved in data/<data_set>/<export_format>_preds_<model_name><epoch_number>/.
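For example, with the npz export format you can inspect predictions directly (the file name below is hypothetical, and the stored array names may differ; check preds.files for the actual contents):

# Hypothetical example of inspecting an exported .npz prediction file
import numpy as np

preds = np.load("data/eida_dataset/npz_preds_main_model0036/diagram_001.npz")
print(preds.files)  # list the arrays stored in the archive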
You can compare different inference runs on the same dataset (this outputs an HTML file data/<data_set>/<filename>.html):
python src/util/html.py --data_set <data_set> --filename <filename>
1. Training from scratch on synthetic data
To re-train the model from scratch on the synthetic dataset (generated on the fly), run:
bash scripts/train_model.sh
2. Training on a custom dataset
Turn SVG files into COCO-like annotations using the following script:
- data_set: folder inside data/ where the evaluation dataset is located (defaults to eida_dataset)
- sanity_check: add this flag if you want to visualize the processed annotations (will save the images in data/<data_set>/svgs/)
- train_portion: float between 0 and 1 to split the dataset into train and val (defaults to 0.8)
The input dataset should be organized as follows:
data/
└── <dataset_name>/
└── images/ # images annotated by the SVG files in the svgs folder
└── svgs/ # SVG files containing the ground truth for training
python src/svg_to_train.py --data_set <dataset_name> --sanity_check
# for eida_dataset
python src/svg_to_train.py --data_set eida_dataset --sanity_check
Training data will be created in data/<dataset_name>/groundtruth/. You can use it to run the finetuning script.
To train on a custom dataset, the ground truth annotations should be in a COCO-like format, structured as follows:
data/
└── <groundtruth_data>/
└── annotations/ # folder containing JSON files (one for train, one for val) in COCO-like format
└── train/ # train images (corresponding to train.json)
└── val/ # val images (corresponding to val.json)
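For reference, here is a minimal sketch of what such a COCO-like annotations file could contain (the field values and category names are illustrative assumptions; the exact encoding is produced by src/svg_to_train.py):

# Illustrative COCO-like annotation skeleton; field values and category
# names are assumptions, the exact encoding comes from src/svg_to_train.py
import json

coco = {
    "images": [{"id": 1, "file_name": "diagram_001.jpg", "width": 1000, "height": 800}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [120.0, 80.0, 300.0, 40.0]},  # [x, y, w, h]
    ],
    "categories": [{"id": 1, "name": "line"}, {"id": 2, "name": "circle"}, {"id": 3, "name": "arc"}],
}

with open("train.json", "w") as f:
    json.dump(coco, f, indent=2)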
Run the following script to train the model on the custom dataset:
- model_name: corresponds to the folder inside logs/ where the checkpoint file is located (the last checkpoint will be used)
- groundtruth_dir: relative path to a folder inside data/ where the ground truth dataset is located
- device_nb: GPU device number to use for training (defaults to 0)
- batch_size: batch size for training (defaults to 2)
- max_size: maximum image size for data augmentation (defaults to 1000), to prevent out-of-memory errors
- learning_rate: learning rate for training (defaults to 0.0001)
- epoch_nb: number of epochs to train (defaults to 50)
bash scripts/finetune_model.sh <model_dirname> <groundtruth_dir> <device_nb> <batch_size> <max_size> <learning_rate> <epoch_nb>
# to finetune main_model on device #2 using the data generated by the previous script
bash scripts/finetune_model.sh main_model eida_dataset/groundtruth 2
The outputs of your run will be logged with wandb.
If you find this work useful, please consider citing:
@misc{kalleli2024historical,
title={Historical Astronomical Diagrams Decomposition in Geometric Primitives},
author={Syrine Kalleli and Scott Trigg and Ségolène Albouy and Matthieu Husson and Mathieu Aubry},
year={2024},
eprint={2403.08721},
archivePrefix={arXiv},
primaryClass={cs.CV}
}