TextBPN-MLOCR: Advanced Multi-Lingual Scene Text Detection

Enhanced version of TextBPN++ for robust scene text detection across multiple languages and artistic fonts. Trained on large-scale synthetic and real-world text datasets for superior performance in diverse scenarios.

News

2025.06.30 Our model TextBPN-MLOCR achieved first place on ArT2019 and MLT2019.

✨ Key Features

Multi-Lingual Support: Detect text in Arabic, Bangla, Chinese, Japanese, Korean, Latin, Hindi
Artistic Text Handling: Accurately processes stylized and decorative fonts
Optimized Performance: Fully supports modern NVIDIA GPUs
Large-scale Training:
- 🧪 1.5M+ synthetic text samples
- 📸 500K+ real-world text samples

🛠️ Hardware Requirements

Component	Requirement
GPU	NVIDIA GPUs
CUDA	12.2
Python	≥ 3.9
OS	Linux (recommended)
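
The requirements above can be checked before installing anything else. The following is a minimal sketch, not part of this repository: the helper names `python_ok` and `cuda_report` are hypothetical, and the CUDA check assumes PyTorch is the framework in use (it is imported in the Quick Start). If PyTorch is not installed yet, the CUDA check simply reports that nothing was found.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version taken from the requirements table


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def cuda_report():
    """Return (cuda_available, cuda_version); (False, None) if torch is missing."""
    try:
        import torch  # optional here: only needed for the CUDA check
    except ImportError:
        return False, None
    # torch.version.cuda is the CUDA version PyTorch was built against
    # (None for CPU-only builds), not necessarily the system toolkit version.
    return torch.cuda.is_available(), torch.version.cuda


if __name__ == "__main__":
    print("Python OK:", python_ok())
    available, version = cuda_report()
    print("CUDA available:", available, "| PyTorch built for CUDA:", version)
```

Note that the CUDA version reported by PyTorch may differ from the system toolkit listed in the table; whether a mismatch matters for building the DCN extension is not stated here.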

🔽 Model Download

Download pre-trained models from HuggingFace Hub:
https://huggingface.co/somos99/TextBPN-MLOCR

📦 Installation

Install the Python dependencies:

pip install -r requirements.txt

Build the DCN (deformable convolution) CUDA extension:

sh make.sh

🚀 Quick Start

import cv2
import torch

from ocr.ocr_detection import FrameOCR

if __name__ == "__main__":
    # Select the first GPU and load the detector with the pre-trained weights.
    torch.cuda.set_device(0)
    model_path = './models/TextBPN_deformable_resnet50_best2.pth'
    detect_model = FrameOCR(model_path, backbone="deformable_resnet50", use_gpu=True, need_layout=True, test_speed=False)

    # Read the test image; cv2.imread returns None if the file cannot be read.
    test_img = "test.jpg"
    raw_images = cv2.imread(test_img)
    if raw_images is None:
        raise FileNotFoundError(f"Could not read image: {test_img}")
    # Convert single-channel (grayscale) input to 3-channel BGR.
    if len(raw_images.shape) == 2:
        raw_images = cv2.cvtColor(raw_images, cv2.COLOR_GRAY2BGR)

    # detect() takes a list of images.
    out_puts = detect_model.detect([raw_images])
    print(out_puts)
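
The structure of `out_puts` is not documented in this README. As a rough sketch of typical post-processing for arbitrary-shape text detection, the helpers below assume that each detected text boundary can be obtained as a sequence of (x, y) points. That assumption, and the names `polygon_to_bbox` and `polygon_area`, are illustrative and not part of the `FrameOCR` API.

```python
def polygon_to_bbox(points):
    """Return (x_min, y_min, x_max, y_max) for a sequence of (x, y) points."""
    if len(points) == 0:
        raise ValueError("polygon has no points")
    xs = [float(p[0]) for p in points]
    ys = [float(p[1]) for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(points):
    """Return the polygon area via the shoelace formula."""
    n = len(points)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


if __name__ == "__main__":
    # Hypothetical boundary: a 4 x 3 axis-aligned rectangle.
    square = [(0, 0), (4, 0), (4, 3), (0, 3)]
    print(polygon_to_bbox(square))  # (0.0, 0.0, 4.0, 3.0)
    print(polygon_area(square))     # 12.0
```

The bounding box is handy for cropping a region to pass to a recognizer, and the area can be used to filter out very small detections. Both need adapting once the actual output format is known.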

🎨 Gradio

The repository includes a Gradio demo script, gradio_app.py.

📖 References

@inproceedings{zhang2021adaptive,
  title={Adaptive boundary proposal network for arbitrary shape text detection},
  author={Zhang, Shi-Xue and Zhu, Xiaobin and Yang, Chun and Wang, Hongfa and Yin, Xu-Cheng},
  booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
  pages={1305--1314},
  year={2021}
}
@article{zhang2023arbitrary,
  title={Arbitrary shape text detection via boundary transformer},
  author={Zhang, Shi-Xue and Yang, Chun and Zhu, Xiaobin and Yin, Xu-Cheng},
  journal={IEEE Transactions on Multimedia},
  volume={26},
  pages={1747--1760},
  year={2023},
  publisher={IEEE}
}

⚖️ License

This project is licensed under the MIT License.

🙏 Acknowledgements

This project extends the original work from:

TextBPN++: GitHub Repository
Contributors to the TextBPN project

Contribute & Support

🌟 Star us on GitHub → https://github.com/somos99/TextBPN-MLOCR
🐛 Report issues → https://github.com/somos99/TextBPN-MLOCR/issues
📥 Pull requests welcome!
