Robust Optical Recognition of Cursive Pashto Script Using Scale, Rotation and Location Invariant Approach

PLoS One. 2015 Sep 14;10(9):e0133648. doi: 10.1371/journal.pone.0133648. eCollection 2015.

Abstract

The presence of a large number of unique shapes called ligatures in cursive languages, along with variations due to scaling, orientation and location, poses one of the most challenging pattern recognition problems. Recognition of the large number of ligatures is often a complicated task in oriental languages such as Pashto, Urdu, Persian and Arabic. Research on cursive script recognition often ignores the fact that scaling, orientation, location and font variations are common in printed cursive text. As a result, these variations are not included in image databases or in experimental evaluations. This research uncovers challenges faced by Arabic cursive script recognition in a holistic framework by considering Pashto as a test case, because the Pashto language has a larger alphabet set than Arabic, Persian and Urdu. A database containing 8000 images of 1000 unique ligatures with scaling, orientation and location variations is introduced. In this article, a feature space based on the scale invariant feature transform (SIFT), along with a segmentation framework, is proposed for overcoming the above-mentioned challenges. The experimental results show significantly improved performance of the proposed scheme over traditional feature extraction techniques such as principal component analysis (PCA).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Language
  • Middle East
  • Pattern Recognition, Automated / methods*
  • Writing*

Grants and funding

This work is sponsored by the Higher Education Commission of Pakistan and Shaheed Benazir Bhutto University, Sheringal, Dir, Pakistan, under Award Letter No: SBBU/Estb/Ord/13- 315. Genie Technologies (Pvt) Ltd provided support in the form of a salary for author SHA, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. Sayed Hassan Amin has contributed voluntarily in his private capacity. The specific roles of all authors are articulated in the “author contributions” section.