You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting"
[2025/05/16]: 🚀🚀 We release LogicOCR, a benchmark designed to evaluate the logical reasoning abilities of Large Multimodal Models (LMMs) on text-rich images, while minimizing reliance on domain-specific knowledge. We offer key insights for enhancing multimodal reasoning.
[2024/04/25]: Update DeepSolo models finetuned on BOVText and DSText video datasets.
[2023/06/02]: Update the pre-trained and fine-tuned Chinese scene text spotting model (78.3% 1-NED on ICDAR 2019 ReCTS).
[2023/05/31]: The extension paper (DeepSolo++) is submitted to ArXiv. The code and models will be released soon.
[2023/02/28]: DeepSolo is accepted by CVPR 2023. 🎉🎉
If you find DeepSolo helpful, please consider giving this repo a star ⭐ and citing:
@inproceedings{ye2023deepsolo,
title={DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting},
author={Ye, Maoyuan and Zhang, Jing and Zhao, Shanshan and Liu, Juhua and Liu, Tongliang and Du, Bo and Tao, Dacheng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={19348--19357},
year={2023}
}
@article{ye2023deepsolo++,
title={DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting},
author={Ye, Maoyuan and Zhang, Jing and Zhao, Shanshan and Liu, Juhua and Liu, Tongliang and Du, Bo and Tao, Dacheng},
booktitle={arxiv preprint arXiv:2305.19957},
year={2023}
}
Acknowledgement
This project is based on Adelaidet. For academic use, this project is licensed under the 2-clause BSD License.
About
The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting"