Going Denser with Open-Vocabulary Part Segmentation
Object detection has been expanded from a limited number of categories to open vocabulary.
Moving forward, a complete intelligent vision system requires understanding finer-grained object descriptions, namely object parts.
In this work, we propose a detector with the ability to predict both open-vocabulary objects and their part segmentation.
This ability comes from two designs:
(1) We train the detector jointly on part-level, object-level, and image-level data.
(2) We parse a novel object into its parts via its dense semantic correspondence with a base object, as sketched below.
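The part parsing in (2) can be pictured as a nearest-neighbor label transfer over dense features. The following is a minimal, hypothetical sketch, not the repository's actual API: it assumes dense per-pixel features (e.g. DINO ViT features, as in dino-vit-features) have already been extracted for a base object with known part masks and for a novel object, and it transfers part labels through cosine-similarity correspondence.

```python
# Hypothetical sketch of design (2): label the parts of a novel object by
# dense semantic correspondence with a base object whose part masks are known.
# Feature extraction and the actual VLPart pipeline are abstracted away;
# all names below are illustrative.
import torch
import torch.nn.functional as F

def parse_novel_parts(novel_feats, base_feats, base_part_masks):
    """Assign a part index to every novel-object pixel.

    novel_feats:     (C, Hn, Wn) dense features of the novel object
    base_feats:      (C, Hb, Wb) dense features of the base object
    base_part_masks: (P, Hb, Wb) binary masks of the base object's P parts
    returns:         (Hn, Wn) tensor of part indices for the novel object
    """
    C, Hn, Wn = novel_feats.shape
    # Flatten spatial dims and L2-normalize so dot products are cosine similarities.
    novel = F.normalize(novel_feats.reshape(C, -1), dim=0)   # (C, Hn*Wn)
    base = F.normalize(base_feats.reshape(C, -1), dim=0)     # (C, Hb*Wb)
    # Dense correspondence: for each novel pixel, find its most similar base pixel.
    sim = novel.t() @ base                                    # (Hn*Wn, Hb*Wb)
    nearest_base = sim.argmax(dim=1)                          # (Hn*Wn,)
    # Per base pixel, take the part whose mask covers it (assumes masks tile the object).
    base_labels = base_part_masks.reshape(base_part_masks.shape[0], -1).argmax(dim=0)
    # Propagate each base pixel's part label to its corresponding novel pixel.
    return base_labels[nearest_base].reshape(Hn, Wn)
```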
We provide a large set of baseline results and trained models in the Model Zoo.
License
The majority of this project is licensed under the MIT License. Portions of the project are available under the separate licenses of the referenced projects, including CLIP, Detic and dino-vit-features. Many thanks for their wonderful work.
Citation
If you use VLPart in your research or wish to refer to the baseline results published here, please use the following BibTeX entry:
@article{peize2023vlpart,
title = {Going Denser with Open-Vocabulary Part Segmentation},
author = {Sun, Peize and Chen, Shoufa and Zhu, Chenchen and Xiao, Fanyi and Luo, Ping and Xie, Saining and Yan, Zhicheng},
journal = {arXiv preprint arXiv:2305.11173},
year = {2023}
}