PyTorch code for "Exploring Distortion Prior with Latent Diffusion Models for Remote Sensing Image Compression", xxxxxx, 2024.
- Authors: Junhui Li, Jutao Li, Xingsong Hou, and Huake Wang
Learning-based image compression algorithms typically focus on designing encoding and decoding networks and improving the accuracy of entropy model estimation to enhance rate-distortion (RD) performance. However, few algorithms leverage the compression distortion prior from existing compression algorithms to improve RD performance. In this paper, we propose a latent diffusion model-based remote sensing image compression (LDM-RSIC) method, which aims to enhance the final decoding quality of remote sensing (RS) images by utilizing the distortion prior generated by an LDM. Our approach consists of two stages. In Stage I, a self-encoder learns the prior from the high-quality input image. In Stage II, the prior is generated by an LDM, conditioned on the decoded image of an existing learning-based image compression algorithm, and used as auxiliary information for generating texture-rich enhanced images. To better utilize the prior, a channel attention and gate-based dynamic feature attention module (DFAM) is embedded into a Transformer-based multi-scale enhancement network (MEN) for image enhancement. Extensive experimental results demonstrate that the proposed LDM-RSIC outperforms existing state-of-the-art traditional and learning-based image compression algorithms in terms of both subjective perception and objective metrics.
git clone https://github.com/mlkk518/LDM-RSIC.git
- CUDA 11.7
- Python 3.7.12
- PyTorch 1.13.1
- Torchvision 0.14.1
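The versions listed above can be checked against the active environment before running anything. The following is a minimal sketch, not part of the repository: the helper names (`strip_local`, `installed_versions`, `mismatches`) are hypothetical, and it assumes that a missing or broken install should simply be reported as "not found".

```python
import sys

# Versions listed in this README.
EXPECTED = {"python": "3.7.12", "torch": "1.13.1", "torchvision": "0.14.1", "cuda": "11.7"}


def strip_local(version):
    """Drop a local build tag such as '+cu117' from a version string."""
    return version.split("+", 1)[0]


def installed_versions():
    """Collect the versions found in the current environment (None if missing)."""
    found = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "torch": None,
        "torchvision": None,
        "cuda": None,
    }
    try:
        import torch
        found["torch"] = strip_local(torch.__version__)
        found["cuda"] = torch.version.cuda  # None for CPU-only builds
    except Exception:  # missing or broken install
        pass
    try:
        import torchvision
        found["torchvision"] = strip_local(torchvision.__version__)
    except Exception:  # missing or broken install
        pass
    return found


def mismatches(expected, found):
    """Return {name: (expected, found)} for every entry that differs."""
    return {k: (v, found.get(k)) for k, v in expected.items() if found.get(k) != v}


if __name__ == "__main__":
    for name, (want, got) in sorted(mismatches(EXPECTED, installed_versions()).items()):
        print("{}: expected {}, found {}".format(name, want, got))
```

The script prints nothing when the environment matches the list. Other versions may also work; a reported mismatch is only a hint, not proof that the code will fail.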
Please download the following remote sensing benchmarks:
- Experimental datasets: DOTA-v1.5 | UC-M
- Testing sets (Baidu Netdisk): DOTA (download code: ldc1) | UC_M (download code: pvf3)
Download the pre-trained models (Baidu Netdisk), download code: v72j
- Step I. Change the data roots in ./ELIC/scripts/test.sh to point to your data, then use the pretrained ELIC models to generate the initial decoded images.
- Step II. Refer to test_DiffRS2_lambda.yml to set the data roots and the pretrained LDM models, then run sh ./scriptEn/test.sh Lambda Gpu_ID, where Lambda is one of 0.0004, 0.0008, 0.0032, 0.01, or 0.045.
sh ./ELIC/scripts/test.sh 0.0008 0
sh ./scriptEn/test.sh 0.0008 0
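To evaluate every listed lambda in one go, the two commands above can be wrapped in a loop. This is a rough sketch under stated assumptions: `build_test_commands` and `run_sweep` are hypothetical helpers, the script is assumed to be launched from the repository root, and the data roots and pretrained model paths are assumed to be configured already for each lambda.

```python
import subprocess

# Lambda values listed in Step II, kept as strings so they are passed through unchanged.
LAMBDAS = ["0.0004", "0.0008", "0.0032", "0.01", "0.045"]


def build_test_commands(lam, gpu_id=0):
    """Build the two test commands for one lambda: ELIC decoding first, then enhancement."""
    return [
        ["sh", "./ELIC/scripts/test.sh", str(lam), str(gpu_id)],
        ["sh", "./scriptEn/test.sh", str(lam), str(gpu_id)],
    ]


def run_sweep(lambdas=LAMBDAS, gpu_id=0, dry_run=True):
    """Print the test commands for every lambda and, unless dry_run is set, run them."""
    for lam in lambdas:
        for cmd in build_test_commands(lam, gpu_id):
            print(" ".join(cmd))
            if not dry_run:
                subprocess.run(cmd, check=True)  # stop at the first failing command


if __name__ == "__main__":
    run_sweep(dry_run=True)  # switch to dry_run=False to actually launch the scripts
```

With `dry_run=True` the sketch only prints the commands, which makes it easy to confirm the order (ELIC first, then the enhancement stage) before committing GPU time.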
- Step I. Learning the compression distortion prior.
- Step II. Using the LDM to generate the distortion prior, which is then fed into the MEN to produce improved images.
sh ./scriptEn/trainS1.sh 0.0008 0
sh ./scriptEn/trainS2.sh 0.0008 0
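The two training commands are meant to run in order. A small wrapper can enforce that, sketched here with hypothetical helper names (`train_commands`, `train`). It assumes that Stage II should not start if Stage I fails, and it does not cover any configuration edits that may be needed between the two stages.

```python
import subprocess


def train_commands(lam, gpu_id=0):
    """Return the Stage I and Stage II training commands in the order they should run."""
    return [
        ["sh", "./scriptEn/trainS1.sh", str(lam), str(gpu_id)],  # Stage I: learn the prior
        ["sh", "./scriptEn/trainS2.sh", str(lam), str(gpu_id)],  # Stage II: LDM prior + MEN
    ]


def train(lam, gpu_id=0, runner=subprocess.run):
    """Run both stages; check=True aborts before Stage II if Stage I exits with an error."""
    for cmd in train_commands(lam, gpu_id):
        runner(cmd, check=True)


if __name__ == "__main__":
    # Preview only; call train("0.0008", 0) from the repository root to launch training.
    for cmd in train_commands("0.0008", 0):
        print(" ".join(cmd))
```

The `runner` parameter exists only so the ordering can be exercised without launching real training jobs.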
If you have any questions or suggestions, feel free to contact me. 😊
Email: [email protected]
If you find our work helpful in your research, please consider citing it. We appreciate your support!😊
This work was supported by:
@article{li2024ldm,
title={Exploring Distortion Prior with Latent Diffusion Models for Remote Sensing Image Compression},
author={Li, Junhui and Li, Jutao and Hou, Xingsong and Wang, Huake},
journal={arXiv preprint arXiv:2406.03961},
year={2024}
}