A unified framework for multioriented text detection and recognition

Cong Yao; Xiang Bai; Wenyu Liu

doi:10.1109/TIP.2014.2353813

IEEE Trans Image Process. 2014 Nov;23(11):4737-49. doi: 10.1109/TIP.2014.2353813. Epub 2014 Sep 4.

PMID: 25203989

Abstract

High-level semantics embodied in scene texts are both rich and clear, and thus can serve as important cues for a wide range of vision applications, such as image understanding, image indexing, video search, geolocation, and automatic navigation. In this paper, we present a unified framework for text detection and recognition in natural images. The contributions of this paper are threefold: 1) text detection and recognition are accomplished concurrently using exactly the same features and classification scheme; 2) in contrast to methods in the literature, which mainly focus on horizontal or near-horizontal texts, the proposed system is capable of localizing and reading texts of varying orientations; and 3) a new dictionary search method is proposed to correct the recognition errors usually caused by confusion among similar yet distinct characters. As an additional contribution, a novel image database with texts of different scales, colors, fonts, and orientations in diverse real-world scenarios is generated and released. Extensive experiments on standard benchmarks as well as the proposed database demonstrate that the proposed system achieves highly competitive performance, especially on multioriented texts.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Artificial Intelligence*
Documentation / classification*
Image Enhancement / methods
Image Interpretation, Computer-Assisted / methods*
Natural Language Processing*
Pattern Recognition, Automated / methods*
Photography / methods
Reading
Reproducibility of Results
Sensitivity and Specificity
Writing*