This repo is the official implementation of the AAAI 2024 paper "DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degradations"
torch == 1.7.1+cu101
numpy == 1.19.2
opencv-python == 4.5.1.48
The structure of the training data is shown below:
├── Hybrid/
│ ├── Degraded/
│ │ ├── Blur/
│ │ ├── Noise/
│ │ ├── Shadow/
│ │ ├── Watermark/
│ │ ├── WithBack/
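Before launching pre-training, it can help to confirm that the degraded-image folders are in place. The sketch below checks only for the five sub-folders listed in the tree above. The helper name `missing_degradation_dirs` and its `root` argument are illustrative assumptions and are not part of this repo.

```python
from pathlib import Path

# Sub-folders listed under Hybrid/Degraded/ in the tree above.
DEGRADATIONS = ("Blur", "Noise", "Shadow", "Watermark", "WithBack")


def missing_degradation_dirs(root):
    """Return the expected Hybrid/Degraded sub-folders that are absent under root.

    Hypothetical helper for illustration; it is not part of the DocNLC code.
    """
    degraded = Path(root) / "Hybrid" / "Degraded"
    return [name for name in DEGRADATIONS if not (degraded / name).is_dir()]


if __name__ == "__main__":
    # Assumes the dataset was extracted into the current directory.
    missing = missing_degradation_dirs(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All degradation folders found.")
```

An empty list means every listed degradation folder exists. The check says nothing about the folder contents.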
To generate the training dataset, run:
python generate_training_dataset.py (Coming soon)
Alternatively, download it from: Pre-training Dataset (21.5 GB)
Hyper-parameters such as batch size and learning rate are set in dedicated YAML files stored in the options folder. Pre-training, fine-tuning, and testing each require an appropriate YAML file; a sample file is provided in the options folder.
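As a rough illustration of how such an option file might be read, the sketch below parses YAML text with PyYAML's `yaml.safe_load`. The keys `batch_size` and `learning_rate`, their values, and the helper `load_options` are assumptions made for this example. The actual field names are in the sample file in the options folder, and the repo's own loader may differ.

```python
import yaml  # PyYAML, installed separately (pip install pyyaml)

# Hypothetical option file contents; the real keys live in ./options/*.yml.
EXAMPLE_OPTIONS = """
batch_size: 8
learning_rate: 0.0001
"""


def load_options(text):
    """Parse YAML text into a dict, rejecting anything that is not a mapping."""
    options = yaml.safe_load(text)
    if not isinstance(options, dict):
        raise ValueError("option file must contain a top-level mapping")
    return options


if __name__ == "__main__":
    opts = load_options(EXAMPLE_OPTIONS)
    print(opts["batch_size"], opts["learning_rate"])
```

To read a file instead of a string, pass an open file handle to `yaml.safe_load`; it accepts both.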
- Edit ./options/pretrain.yml
python pretrain.py
- Edit ./options/finetune.yml
python finetune.py
- Edit ./options/test.yml
python test.py
Note that the PSNR values printed to the terminal during testing are meaningless. The output images are evaluated in a separate step using the standard skimage.metrics functions.
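A minimal sketch of such an evaluation step is shown below, assuming grayscale 8-bit images and using `peak_signal_noise_ratio` and `structural_similarity` from `skimage.metrics`. The choice of these two metrics, the helper name `evaluate_pair`, and the synthetic test images are assumptions for illustration; this is not the authors' evaluation script.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def evaluate_pair(reference, restored):
    """Compute PSNR and SSIM for two grayscale uint8 images of equal shape.

    Hypothetical helper for illustration; it is not part of the DocNLC code.
    """
    if reference.shape != restored.shape:
        raise ValueError("images must have the same shape")
    psnr = peak_signal_noise_ratio(reference, restored, data_range=255)
    ssim = structural_similarity(reference, restored, data_range=255)
    return {"psnr": float(psnr), "ssim": float(ssim)}


if __name__ == "__main__":
    # Synthetic stand-ins for a ground-truth image and a model output.
    rng = np.random.default_rng(0)
    reference = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
    noise = rng.integers(-5, 6, size=reference.shape)
    restored = np.clip(reference.astype(np.int16) + noise, 0, 255).astype(np.uint8)
    print(evaluate_pair(reference, restored))
```

With real data, each pair could be loaded with `cv2.imread(path, cv2.IMREAD_GRAYSCALE)` before being passed to `evaluate_pair`. Color images would need the channel handling of `structural_similarity` configured for the installed scikit-image version.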
Pretrained Model | Download Link |
---|---|
Asymmetric Comparison | One Drive |
Symmetric Comparison | One Drive |
Our work is based on the following theoretical works:
We have also benefited greatly from the following projects:
@inproceedings{wang2024docnlc,
title={DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degradations},
author={Wang, Ruilu and Xue, Yang and Jin, Lianwen},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={38},
number={6},
pages={5563--5571},
year={2024}
}