Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition

IEEE Trans Neural Netw Learn Syst. 2020 Nov;31(11):4637-4648. doi: 10.1109/TNNLS.2019.2956965. Epub 2020 Oct 29.

Abstract

We propose a novel model, called the stroke sequence-dependent deep convolutional neural network (SSDCNN), which uses the stroke sequence information and the eight-directional features of Chinese characters for online handwritten Chinese character recognition (OLHCCR). SSDCNN learns the representation of online handwritten Chinese characters (OLHCCs) by incorporating the natural sequence information of the strokes, and it naturally incorporates the eight-directional features as well. First, SSDCNN takes the stroke sequence as input and transforms it into stacks of feature maps following the writing order of the strokes. Second, fixed-length, stroke sequence-dependent representations of the OLHCCs are derived through convolutional, residual, and max-pooling operations. Third, the stroke sequence-dependent representation is combined with the eight-directional features via several fully connected neural network layers. Finally, the Chinese characters are recognized using a softmax classifier. SSDCNN is trained in two stages: 1) the whole architecture is pretrained on the training data until its performance converges to an acceptable degree and 2) the stroke sequence-dependent representation is combined with the eight-directional features by a fully connected neural network and a softmax layer for further training. The model was experimentally evaluated on the OLHCCR competition tasks of the International Conference on Document Analysis and Recognition (ICDAR) 2013. The recognition error of SSDCNN was 58.28% lower than that of a model using the eight-directional features alone (2.14% versus 5.13%). Owing to its high accuracy (97.86%), the proposed SSDCNN reduced the recognition error by approximately 18.0% compared with that of the winning system in the ICDAR 2013 competition. SSDCNN integrated with an adaptation mechanism, referred to as the SSDCNN+Adapt model, reached a new state-of-the-art (SOTA) accuracy of 97.94%. SSDCNN exploits the stroke sequence information to learn high-quality OLHCC representations, and the learned representation and the classical eight-directional features complement each other within the SSDCNN architecture.
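To make the pipeline described in the abstract concrete, the following is a minimal PyTorch-style sketch of how a stroke-sequence feature-map stack, a convolutional/residual/max-pooling representation, and the fusion with eight-directional features could fit together before a softmax classifier. All layer sizes, the stroke-map resolution, the maximum stroke count, the eight-directional feature dimension, and the class count are illustrative assumptions and are not taken from the paper.

```python
# Illustrative sketch only; shapes and layer sizes are assumptions,
# not the authors' published SSDCNN configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Simple conv-BN residual block (assumed; the paper's exact block may differ)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)


class SSDCNNSketch(nn.Module):
    """Hypothetical SSDCNN-like network.

    Inputs:
      stroke_maps: (B, max_strokes, 64, 64) -- each channel renders one stroke
                   as a bitmap, stacked in writing order (trailing channels are
                   zero for characters with fewer strokes).
      dir_feats:   (B, 512) -- classical eight-directional feature vector.
    """
    def __init__(self, max_strokes=30, dir_feat_dim=512, num_classes=3755):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(max_strokes, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),              # 64x64 -> 32x32
            ResidualBlock(64),
            nn.MaxPool2d(2),              # 32x32 -> 16x16
            ResidualBlock(64),
            nn.AdaptiveMaxPool2d(1),      # fixed-length stroke-sequence representation
        )
        # Fuse the stroke-sequence representation with the eight-directional features.
        self.fuse = nn.Sequential(
            nn.Linear(64 + dir_feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, num_classes),  # softmax is applied via the loss at training time
        )

    def forward(self, stroke_maps, dir_feats):
        rep = self.stem(stroke_maps).flatten(1)            # (B, 64)
        return self.fuse(torch.cat([rep, dir_feats], dim=1))


if __name__ == "__main__":
    model = SSDCNNSketch()
    strokes = torch.randn(2, 30, 64, 64)
    dir_feats = torch.randn(2, 512)
    print(model(strokes, dir_feats).shape)  # torch.Size([2, 3755])
```

In a two-stage training scheme like the one the abstract outlines, one could first pretrain the network and then continue training the fusion layers with the eight-directional features attached; the split shown here between `stem` and `fuse` is only one plausible way to organize that.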

Publication types

  • Research Support, Non-U.S. Gov't