Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model [WACV2024]
This is the official PyTorch implementation of
"Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model (WACV2024, Oral)".
Shoma Iwai, Tomo Miyazaki, and Shinichiro Omachi
We propose a single image compression model that can control bitrate, distortion, and realism. Our method covers a wide range of rate-distortion-realism points within a single model. Please refer to our paper for more details.
- 2024/6/19: Training code, instructions, reproduced training log, and pre-trained stage-2 model are released! See training.md.
0. Install poetry.
curl -sSL https://install.python-poetry.org | python3 -
We used poetry==1.6.1.
1. Clone this repository.
git clone https://github.com/iwa-shi/CRDR.git
cd CRDR
2. Install dependencies using poetry.
poetry install
PyTorch 1.12.1
CUDA 11.3
CompressAI 1.2.4
Other PyTorch/CUDA versions might work, but these are the environments that we tested.
You can download the pre-trained model (crdr.pth.tar) from GDrive manually or use the following command:
curl "https://drive.usercontent.google.com/download?id=1H6T9-k0RX5SXk0VljHNiXUGXZrZl2seb&confirm=xxx" -o crdr.pth.tar
You can run a quick test on the images in ./demo_images (three images from the Kodak dataset) using the following command:
poetry run python scripts/compress.py --config_path ./config/crdr.yaml --model_path ./crdr.pth.tar --img_dir ./demo_images --save_dir ./demo_results/crdr_q000_b384_kodak -q 0.00 -b 3.84 --decompress -d cuda
Binary compressed files (kodimXXX.bin), reconstructions (kodimXXX.png), _bitrates.csv, and _avg_bitrate.json will be stored in ./demo_results/crdr_q000_b384_kodak.
The average bitrate should be avg_bpp: 0.0641. Please verify that the reconstructions look correct.
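As a rough cross-check of the reported bitrate, bits per pixel can be recomputed from a compressed file's size and the image resolution. The sketch below assumes that each .bin file holds the complete bitstream for one image; the helper names are hypothetical and this is not necessarily how scripts/compress.py computes the values in _bitrates.csv.

```python
import os


def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Bitrate in bits per pixel: total bits divided by the pixel count."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return 8.0 * num_bytes / (width * height)


def file_bpp(bin_path: str, width: int, height: int) -> float:
    """bpp of one compressed file (assumes the file is the whole bitstream)."""
    return bits_per_pixel(os.path.getsize(bin_path), width, height)


# Kodak images are 768x512 (or 512x768), i.e. 393,216 pixels each.
# The 3150-byte size below is a made-up value used only for illustration.
example_bpp = bits_per_pixel(3150, 768, 512)
print(f"{example_bpp:.4f} bpp")
```

Averaging file_bpp over the three demo images should land close to the avg_bpp value above if the assumption about the file contents holds.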
Note
Currently, only a single GPU is supported. If you are using a multi-GPU machine, set CUDA_VISIBLE_DEVICES={DEVICE_ID} to avoid unexpected behavior.
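If you launch the script from Python rather than from a shell, one way to pin a single GPU is to pass a modified environment to the child process. This is a minimal sketch: single_gpu_env and run_demo are hypothetical helpers, and the command simply mirrors the demo command above.

```python
import os
import subprocess


def single_gpu_env(device_id: int) -> dict:
    """Copy the current environment and expose only one GPU to the child process."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(device_id)
    return env


# Same arguments as the demo command above, split into a list for subprocess.
DEMO_COMMAND = [
    "poetry", "run", "python", "scripts/compress.py",
    "--config_path", "./config/crdr.yaml",
    "--model_path", "./crdr.pth.tar",
    "--img_dir", "./demo_images",
    "--save_dir", "./demo_results/crdr_q000_b384_kodak",
    "-q", "0.00", "-b", "3.84",
    "--decompress", "-d", "cuda",
]


def run_demo(device_id: int = 0) -> None:
    """Run the demo on one GPU; raises CalledProcessError if the script fails."""
    subprocess.run(DEMO_COMMAND, env=single_gpu_env(device_id), check=True)
```

Calling run_demo(1) from the repository root would run the demo with only GPU 1 visible to PyTorch.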
You can run the pre-trained model on your own dataset or on standard benchmarks such as the CLIC2020 test set and the Kodak dataset.
poetry run python scripts/compress.py --config_path ./config/crdr.yaml --model_path ./crdr.pth.tar --img_dir PATH/TO/DATASET --save_dir ./results/crdr_qXXX_bXXX_DATASET -q QUALITY -b BETA --decompress -d cuda
- --quality adjusts the bitrate. Float value from 0.0 to 4.0 in our pre-trained model. 0.0: Low bitrate, 4.0: High bitrate
- --beta adjusts realism. Float value from 0.0 to 5.12. 0.0: Low distortion, 5.12: High realism
In Fig.5 in our paper, --quality: {0.0, 0.25, 0.5, ..., 3.5, 3.75, 4.0} and --beta: {0.0, 3.84} are used (17*2=34 points in total).
For example,
poetry run python scripts/compress.py --config_path ./config/crdr.yaml --model_path ./crdr.pth.tar --img_dir ./datasets/CLIC/test --save_dir ./results/crdr_q150_b384_CLIC -q 1.5 -b 3.84 --decompress -d cuda
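To sweep the full Fig.5 grid, the 34 (quality, beta) pairs and their commands can be generated programmatically. The sketch below is an illustration only: the helper names are hypothetical, and the save-directory pattern (quality and beta multiplied by 100 and zero-padded to three digits) is inferred from the two example directory names in this README, not taken from the repository code.

```python
def quality_beta_grid():
    """All (quality, beta) pairs used in Fig.5: 17 qualities x 2 betas."""
    qualities = [i * 0.25 for i in range(17)]  # 0.0, 0.25, ..., 4.0
    betas = [0.0, 3.84]
    return [(q, b) for b in betas for q in qualities]


def save_dir_name(quality: float, beta: float, dataset: str) -> str:
    """Directory name such as crdr_q150_b384_CLIC (assumed naming pattern)."""
    return f"crdr_q{round(quality * 100):03d}_b{round(beta * 100):03d}_{dataset}"


def build_command(quality: float, beta: float, img_dir: str, dataset: str) -> list:
    """Argument list for scripts/compress.py, using the flags shown above."""
    return [
        "poetry", "run", "python", "scripts/compress.py",
        "--config_path", "./config/crdr.yaml",
        "--model_path", "./crdr.pth.tar",
        "--img_dir", img_dir,
        "--save_dir", f"./results/{save_dir_name(quality, beta, dataset)}",
        "-q", str(quality), "-b", str(beta),
        "--decompress", "-d", "cuda",
    ]


if __name__ == "__main__":
    # Print one command per grid point; pipe into a shell or job scheduler.
    for q, b in quality_beta_grid():
        print(" ".join(build_command(q, b, "./datasets/CLIC/test", "CLIC")))
```

Printing the commands instead of executing them keeps the sketch side-effect free; each line can be run as-is from the repository root.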
You can calculate metrics (PSNR, FID, LPIPS, DISTS) of the reconstructions by running the following script:
poetry run python scripts/calc_metrics.py --real_dir PATH/TO/DATASET --fake_dir PATH/TO/RECONSTRUCTIONS -d cuda
Results will be stored in {fake_dir}/_metrics.json.
For example,
poetry run python scripts/calc_metrics.py --real_dir ./datasets/CLIC/test --fake_dir ./results/crdr_q150_b384_CLIC -d cuda
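For reference, PSNR is the simplest of the four metrics and can be written in a few lines. The following is a minimal standard-library sketch of the textbook definition for 8-bit pixel values; it is not the implementation in scripts/calc_metrics.py, which may differ in details such as color handling or per-image averaging.

```python
import math
from typing import Sequence


def psnr(original: Sequence[int], reconstructed: Sequence[int],
         max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two flat pixel sequences."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means lower distortion, so reconstructions produced with a lower --beta would be expected to score higher on PSNR, while FID, LPIPS, and DISTS capture the perceptual side of the trade-off.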
You can find results on CLIC2020, Kodak, and DIV2K at rd_results.
See ./docs/training.md.
- Release repository
- Pre-trained model
- Test code and instructions
- Training code and instructions
If you find this code useful for your research, please consider citing our paper:
@INPROCEEDINGS{iwai2024crdr,
author={Iwai, Shoma and Miyazaki, Tomo and Omachi, Shinichiro},
booktitle={2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
title={Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model},
year={2024},
volume={},
number={},
pages={2888-2897},
doi={10.1109/WACV57701.2024.00288}}
We thank the authors of the following repositories:
- CompressAI: For entropy_model.
- ELIC (unofficial implementation by JiangWeibeta)
- HiFiC
- HiFiC (unofficial PyTorch Implementation by Justin-Tan)
- tensorflow-compression
- MMEngine: For config and registry.
If you have any questions or encounter any issues, please feel free to open an issue.