This code has been developed under Python 3.5, PyTorch 1.0.0 and CUDA 10.0.
Please run requirements.py to check if all required packages are installed.
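As a complement to requirements.py, the following is a minimal sketch of a manual environment check, not the repository's own script. The function name `check_environment` and the report keys are hypothetical; the only PyTorch calls used are `torch.__version__`, `torch.cuda.is_available()` and `torch.cuda.device_count()`.

```python
import importlib
import sys


def check_environment(min_python=(3, 5)):
    """Return a dict describing the interpreter and, if installed, PyTorch."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    try:
        torch = importlib.import_module("torch")
    except (ImportError, OSError):
        # PyTorch is missing or failed to load; report that instead of crashing.
        report["torch_version"] = None
        report["cuda_available"] = False
        report["gpu_count"] = 0
    else:
        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
        report["gpu_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in sorted(check_environment().items()):
        print("{}: {}".format(key, value))
```

If `cuda_available` is reported as False, training as described below will not be possible on that machine.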
The dataset used in this project is COCO 2014. The dataset is available here.
Training
The script train.py is used for training. The parameters are listed in params.json. Note that there are two different configurations for best performance on the image captioning and text-to-image synthesis tasks.
Example usage to train a model on COCO 2014 for captioning:
python train.py --config params_i2t
Example usage to train a model on COCO 2014 for the text-to-image synthesis task:
python train.py --config params_t2i
Note that training requires CUDA 10.0 and GPU devices. The number of GPUs used can be set in params.json. We use 1 Nvidia Volta V100 GPU for the captioning task and 3 Nvidia Volta V100 GPUs with 32GB for the text-to-image synthesis task.
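The sketch below illustrates one way to sanity-check a requested GPU count against the devices actually present before launching training. It is an assumption-laden example, not the repository's code: the field name `num_gpus`, the file name `params_example.json` and both helper functions are hypothetical, so check params.json for the real key.

```python
import json
import os
import tempfile


def load_params(path):
    """Read a JSON parameter file into a dict."""
    with open(path) as handle:
        return json.load(handle)


def usable_gpu_count(params, available, key="num_gpus"):
    """Return the requested GPU count, capped at the number available.

    `key` is a hypothetical field name; the real one may differ.
    """
    requested = int(params.get(key, 1))
    if requested < 1:
        raise ValueError("at least one GPU is required for training")
    return min(requested, available)


if __name__ == "__main__":
    # Write a throwaway config so the example runs without the repository.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "params_example.json")
        with open(path, "w") as handle:
            json.dump({"num_gpus": 3}, handle)
        params = load_params(path)
    # With 3 GPUs requested but only 1 present, the count is capped at 1.
    print(usable_gpu_count(params, available=1))
```

In practice `available` could come from `torch.cuda.device_count()`. Capping silently is a design choice for this sketch; raising an error instead may be preferable when the text-to-image configuration depends on all 3 GPUs.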
Generation and Validation
For evaluation we use the following repos:
Oracle - We use the version of pycocoevalcap that supports Python 3, available here.