ReVOS
├── JPEGImages # all images of the train set and valid set, one folder per video
│ ├── <video1>
│ ├── <video2>
│ └── <video...>
├── mask_dict.json # mask dict (train set & valid set)
├── mask_dict_foreground.json # foreground mask dict (train set & valid set)
├── meta_expressions_train_.json # meta expressions (train set)
└── meta_expressions_valid_.json # meta expressions (valid set)
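As a quick sanity check after downloading, the layout above can be verified with a few lines of Python. This is only a minimal sketch: the helper names `check_revos_layout`, `list_videos`, and `load_json` are hypothetical and not part of the VISA codebase, and nothing is assumed about the internal structure of the JSON files beyond their being valid JSON.

```python
import json
from pathlib import Path

# File names taken from the directory layout above.
EXPECTED_FILES = [
    "mask_dict.json",
    "mask_dict_foreground.json",
    "meta_expressions_train_.json",
    "meta_expressions_valid_.json",
]


def check_revos_layout(root):
    """Return the expected entries that are missing under `root` (hypothetical helper)."""
    root = Path(root)
    missing = []
    if not (root / "JPEGImages").is_dir():
        missing.append("JPEGImages")
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


def list_videos(root):
    """Return the sorted names of the per-video folders inside JPEGImages."""
    return sorted(p.name for p in (Path(root) / "JPEGImages").iterdir() if p.is_dir())


def load_json(root, name):
    """Load one of the JSON files; its schema is not assumed here."""
    with open(Path(root) / name, encoding="utf-8") as f:
        return json.load(f)
```

Calling `check_revos_layout("ReVOS")` should return an empty list when every entry from the tree is in place; any names it returns are the pieces still missing.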
Acknowledgement
ReVOS largely borrows videos from MOSE, MeViS, LV-VIS, OVIS, TAO and UVO.
The copyright for these videos is held by the organizers of MOSE, MeViS, LV-VIS, OVIS, TAO and UVO.
BibTeX
Please consider citing VISA if it helps your research.
@article{yan2024visa,
title={VISA: Reasoning Video Object Segmentation via Large Language Models},
author={Yan, Cilin and Wang, Haochen and Yan, Shilin and Jiang, Xiaolong and Hu, Yao and Kang, Guoliang and Xie, Weidi and Gavves, Efstratios},
journal={arXiv preprint arXiv:2407.11325},
year={2024}
}
License
ReVOS is licensed under a CC BY-NC-SA 4.0 License. The ReVOS data is released for non-commercial research purposes only.
About
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Models