A real-time arbitrary-shape text detector

Manhuai Lu; Langlang Li; Chin-Ling Chen

doi:10.1371/journal.pone.0302234

PLoS One. 2024 Apr 16;19(4):e0302234. doi: 10.1371/journal.pone.0302234. eCollection 2024.

Authors

Manhuai Lu¹, Langlang Li², Chin-Ling Chen³ ⁴

Affiliations

¹ College of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Zhongshan Institute, Zhongshan, China.
² School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China.
³ School of Information Engineering, Changchun Sci-Tech University, Changchun, China.
⁴ Department of Computer Science and Information Engineering, Chaoyang University of Technology, Taichung, Taiwan.

Abstract

It is challenging to detect arbitrary-shape text accurately and effectively in natural scenes. While many methods have been proposed for arbitrary-shape text detection, most cannot achieve real-time detection or meet practical needs. In this work, we propose a YOLOv6-based detector that detects arbitrary-shape text effectively and in real time. We add two branches to the neck of the YOLOv6 network to adapt it to text detection, and on the output side we decouple the output of the pixel aggregation (PA) algorithm and use it as the detection head of the model. Experiments on the Total-Text, CTW1500, ICDAR2015, and MSRA-TD500 benchmarks showed that the proposed method outperformed competing methods in both detection accuracy and running time. Specifically, our method achieved an F-measure of 84.1% at 291.8 FPS on 640 × 640 Total-Text images and an F-measure of 81.5% at 199.6 FPS on 896 × 896 ICDAR2015 incidental text images.
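To make the two reported quantities concrete, the following is a minimal Python sketch, not the authors' evaluation code. It assumes the F-measure is the usual harmonic mean of precision and recall, and that FPS is the reciprocal of the mean per-image processing time; the abstract does not spell out either protocol. The function names `f_measure` and `latency_ms` are hypothetical and chosen for illustration only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in [0, 1].

    Assumption: this is the standard F-measure definition; the abstract
    does not state the exact evaluation protocol used.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


def latency_ms(fps: float) -> float:
    """Mean per-image processing time in milliseconds implied by an FPS figure."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # FPS figures taken from the abstract.
    for name, fps in [("Total-Text, 640 x 640", 291.8),
                      ("ICDAR2015, 896 x 896", 199.6)]:
        print(f"{name}: {latency_ms(fps):.2f} ms per image")
```

Under these assumptions, 291.8 FPS corresponds to roughly 3.4 ms per image and 199.6 FPS to roughly 5.0 ms per image.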

Copyright: © 2024 Lu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Algorithms*
Neural Networks, Computer*

Grants and funding

This research was funded by the National Social Science Fund of China, grant number 20BGL141.