A Novel Text Detection System Based on Character and Link Energies

IEEE Trans Image Process. 2014 Sep;23(9):4187-4198. doi: 10.1109/TIP.2014.2341935. Epub 2014 Jul 23.

Abstract

We propose a novel method that uses three new character features to detect text objects comprising two or more isolated characters in images and videos. A new text model is constructed to describe text objects. Each character is a part in the model, and every two neighboring characters are connected by a link. Two characters and the link connecting them are defined as a text unit. For every candidate part, we compute a character energy based on our observation that each character stroke forms two edges with high similarities in length, curvature, and orientation. For every candidate link, we compute a link energy based on the similarities in color, size, stroke width, and spacing between characters that are aligned along a particular direction. For every candidate text unit, we combine the character and link energies to compute a text unit energy, which measures the likelihood that the candidate is a text object. We evaluated the performance of the proposed method on the ICDAR 2003/2005 datasets, the Microsoft Street View dataset, and the VACE video dataset. The experimental results demonstrate that our method can capture the inherent properties of characters and discriminate text from other objects effectively.
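
The text unit described above (two characters plus the link between them) can be illustrated with a minimal sketch. The abstract does not give the energy formulas, so everything below is an assumption made for illustration only: the `Candidate` fields, the min/max ratio similarity, the exponential spacing term, the plain averaging of the four link similarities, and the `weight` parameter are all hypothetical and are not the authors' definitions. The character energy is taken as a precomputed input in [0, 1].

```python
from dataclasses import dataclass
from math import exp
from typing import Tuple


@dataclass
class Candidate:
    """Hypothetical candidate character region (field names are assumptions)."""
    char_energy: float                 # assumed precomputed, in [0, 1]
    color: Tuple[float, float, float]  # mean RGB, each channel in [0, 255]
    height: float                      # character height in pixels
    stroke_width: float                # estimated stroke width in pixels
    x_center: float                    # horizontal center, for horizontal text


def ratio_similarity(a: float, b: float) -> float:
    """Similarity of two positive scalars as min/max, in [0, 1]."""
    if a <= 0 or b <= 0:
        return 0.0
    return min(a, b) / max(a, b)


def color_similarity(c1, c2) -> float:
    """1 minus the Euclidean RGB distance, normalized by the largest possible distance."""
    dist = sum((p - q) ** 2 for p, q in zip(c1, c2)) ** 0.5
    max_dist = (3 * 255.0 ** 2) ** 0.5
    return 1.0 - dist / max_dist


def link_energy(a: Candidate, b: Candidate) -> float:
    """Assumed link energy: mean of color, size, stroke-width, and spacing similarity."""
    mean_height = (a.height + b.height) / 2.0
    spacing = abs(a.x_center - b.x_center)
    # Assumption: no penalty while the centers are within one character height,
    # then exponential decay as the gap grows.
    spacing_sim = exp(-max(0.0, spacing / mean_height - 1.0))
    terms = [
        color_similarity(a.color, b.color),
        ratio_similarity(a.height, b.height),
        ratio_similarity(a.stroke_width, b.stroke_width),
        spacing_sim,
    ]
    return sum(terms) / len(terms)


def text_unit_energy(a: Candidate, b: Candidate, weight: float = 0.5) -> float:
    """Assumed combination: weighted sum of mean character energy and link energy."""
    char_term = (a.char_energy + b.char_energy) / 2.0
    return weight * char_term + (1.0 - weight) * link_energy(a, b)


# Usage: two similar neighboring candidates, and one dissimilar, distant candidate.
left = Candidate(0.8, (20.0, 20.0, 20.0), 12.0, 2.0, 0.0)
right = Candidate(0.8, (20.0, 20.0, 20.0), 12.0, 2.0, 10.0)
clutter = Candidate(0.3, (200.0, 180.0, 40.0), 30.0, 7.0, 90.0)
similar_score = text_unit_energy(left, right)
clutter_score = text_unit_energy(left, clutter)
```

Under these assumptions, a pair of candidates that agree in color, size, stroke width, and spacing scores higher than a pair that does not, which is the behavior the abstract attributes to the text unit energy. A real system would also need the character energy itself, which the abstract derives from the similarity of the two edges of each stroke and which this sketch does not attempt to compute.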