Download the preprocessed metadata for birds and coco and extract them to data/
Download the birds image data. Extract them to data/birds/
Download the coco2014 dataset and extract the images to data/coco/images/
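Before training, it can help to confirm that the image directories named in the steps above exist and are not empty. The following is a minimal sketch, not part of the GALIP repository: the function name check_layout is hypothetical, and it assumes you run it from the repository root. It does not check the preprocessed metadata, because the file names inside those archives are not listed here.

```shell
# Hypothetical helper: verify the image directories from the download steps.
# Usage: check_layout [repo_root]   (defaults to the current directory)
check_layout() {
  root="${1:-.}"
  missing=0
  for d in data/birds data/coco/images; do
    # A directory counts as ready only if it exists and contains something.
    if [ -d "$root/$d" ] && [ -n "$(ls -A "$root/$d" 2>/dev/null)" ]; then
      echo "ok: $d"
    else
      echo "missing or empty: $d"
      missing=1
    fi
  done
  return "$missing"
}

check_layout . || echo "Finish the download steps before training."
```

The function returns a non-zero status when any directory is missing or empty, so it can also gate a training command, for example `check_layout . && bash code/scripts/train.sh ...` adapted to your working directory.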
Training
cd GALIP/code/
Train the GALIP model
For bird dataset: bash scripts/train.sh ./cfg/bird.yml
For coco dataset: bash scripts/train.sh ./cfg/coco.yml
Resume training process
If your training process is interrupted unexpectedly, set state_epoch, log_dir, and pretrained_model_path in train.sh to resume training.
TensorBoard
Our code supports automatic FID evaluation during training; the results are stored in TensorBoard files under ./logs. You can change the test interval by changing test_interval in the YAML file.
For bird dataset: tensorboard --logdir=./code/logs/bird/train --port 8166
For coco dataset: tensorboard --logdir=./code/logs/coco/train --port 8177
Evaluation
Download Pretrained Model
GALIP for COCO. Download and save it to ./code/saved_models/pretrained/
GALIP for CC12M. Download and save it to ./code/saved_models/pretrained/
Evaluate GALIP models
cd GALIP/code/
Set pretrained_model in test.sh
For bird dataset: bash scripts/test.sh ./cfg/bird.yml
For COCO dataset: bash scripts/test.sh ./cfg/coco.yml
For CC12M (zero-shot on COCO) dataset: bash scripts/test.sh ./cfg/coco.yml
Performance
The released model achieves better performance than the paper version on COCO (FID 5.01 vs. 5.85, CS 0.3379 vs. 0.3338); the CC12M zero-shot FID is unchanged at 12.54.
| Model | COCO-FID↓ | COCO-CS↑ | CC12M-ZFID↓ |
| --- | --- | --- | --- |
| GALIP(paper) | 5.85 | 0.3338 | 12.54 |
| GALIP(released) | 5.01 | 0.3379 | 12.54 |
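To put the reported COCO numbers in relative terms, the change can be computed as a percentage of the paper value. This is a small illustrative sketch; the helper name rel_change is hypothetical and not part of the repository.

```shell
# Hypothetical helper: relative change from the paper value to the released value.
rel_change() {
  # $1 = paper value, $2 = released value; prints (released - paper) / paper as a percentage
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f%%\n", (b - a) / a * 100 }'
}

rel_change 5.85 5.01      # COCO-FID (lower is better): prints -14.4%
rel_change 0.3338 0.3379  # COCO-CS (higher is better): prints 1.2%
```

A negative value for FID and a positive value for CS both indicate an improvement, since the two metrics point in opposite directions.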
Sampling
Synthesize images from your text descriptions
The sample.ipynb notebook can be used to sample images.
Citing GALIP
If you find GALIP useful in your research, please consider citing our paper:
@inproceedings{tao2023galip,
title={GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis},
author={Tao, Ming and Bao, Bing-Kun and Tang, Hao and Xu, Changsheng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={14214--14223},
year={2023}
}
The code is released for academic research use only. For commercial use, please contact Ming Tao (陶明) (mingtao2000@126.com).