Download the dataset files and pre-trained models. We use the splits produced by Andrej Karpathy. The precomputed image features are from here and here. To use full image encoders, download the images from their original sources here, here, and here.
We refer to the path of the files extracted from data.tar as $DATA_PATH and from models.tar as $RUN_PATH. Extract vocab.tar to the ./vocab directory.
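
For concreteness, a minimal setup sketch, assuming the three archives above have already been downloaded into the current directory (the `data` and `runs` directory names below are placeholders, not requirements):

```sh
# Extract the archives; data/ and runs/ are arbitrary choices here.
mkdir -p data runs vocab
tar -xvf data.tar -C data      # referred to below as $DATA_PATH
tar -xvf models.tar -C runs    # referred to below as $RUN_PATH
tar -xvf vocab.tar -C vocab    # the vocab files must end up in ./vocab

export DATA_PATH=./data
export RUN_PATH=./runs
```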
Update: the vocabulary was originally built using all splits, including test-set captions; see issue #29 for details. Please avoid using test-set captions if you build on this project.
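
If you rebuild the vocabulary yourself, the idea is simply to collect tokens from the train and dev captions only. A hypothetical invocation (the script name and flags below are illustrative, not the repository's actual interface):

```sh
# Hypothetical: build the vocabulary from train/dev captions only,
# leaving the test split out as recommended above.
python build_vocab.py --splits train,dev --output ./vocab/vocab.pkl
```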
To train the order-embedding variants, pass the following flags to the training script.

Order0

```
--measure order --use_abs --margin .05 --learning_rate .001
```
Order++
```
--measure order --max_violation
```
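
As a sketch, these flags combine with the repository's train.py roughly as follows; the --data_name and --logger_name values are illustrative assumptions, not prescribed here:

```sh
# Order0: absolute embeddings with the order-violation similarity measure.
python train.py --data_name coco_precomp --logger_name runs/order0 \
    --measure order --use_abs --margin .05 --learning_rate .001

# Order++: order measure trained with the max-violation (hard negative) loss.
python train.py --data_name coco_precomp --logger_name runs/order++ \
    --measure order --max_violation
```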
Reference
If you find this code useful, please cite the following paper:
```
@inproceedings{faghri2018vse++,
  title     = {{VSE++}: Improving Visual-Semantic Embeddings with Hard Negatives},
  author    = {Faghri, Fartash and Fleet, David J and Kiros, Jamie Ryan and Fidler, Sanja},
  booktitle = {Proceedings of the British Machine Vision Conference ({BMVC})},
  url       = {https://github.com/fartashf/vsepp},
  year      = {2018}
}
```