PHND: Pashtu Handwritten Numerals Database and deep learning benchmark

PLoS One. 2020 Sep 2;15(9):e0238423. doi: 10.1371/journal.pone.0238423. eCollection 2020.

Abstract

In this paper we introduce a real Pashtu handwritten numerals dataset (PHND) of 50,000 scanned images and make it publicly available for research and scientific use. Although more than fifty million people worldwide use this language for written and oral communication, no significant effort has been devoted to Pashtu Optical Character Recognition (POCR). We present a new approach to Pashtu handwritten numerals recognition (PHNR) based on deep neural networks. We train Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) on high-frequency numerals for feature extraction and classification. We evaluated the performance of the proposed algorithm on the newly introduced Pashtu handwritten numerals database PHND and on the Bangla numeral database CMATERDB 3.1.1. We obtained best recognition rates of 98.00% and 98.64% on PHND and CMATERDB 3.1.1, respectively.
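
The recognition rates quoted above are percentages of correctly classified numeral images. As a minimal sketch of how such a figure could be computed from a classifier's output, the Python below assumes integer digit labels 0 to 9; the function names and the toy label lists are hypothetical illustrations, not the authors' evaluation code or data.

```python
from collections import Counter


def recognition_rate(true_labels, predicted_labels):
    """Percentage of samples whose predicted digit equals the true digit."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


def per_digit_rate(true_labels, predicted_labels):
    """Recognition rate computed separately for each true digit class."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {digit: 100.0 * hits[digit] / totals[digit] for digit in sorted(totals)}


# Hypothetical toy labels: one sample per digit, with digit 7 misread as 1.
y_true = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y_pred = [0, 1, 2, 3, 4, 5, 6, 1, 8, 9]

rate = recognition_rate(y_true, y_pred)
by_digit = per_digit_rate(y_true, y_pred)
print(f"overall recognition rate: {rate:.2f}%")
print(f"rate for digit 7: {by_digit[7]:.2f}%")
```

A per-digit breakdown is included because an overall rate can hide a single numeral class that is frequently confused with another.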

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Deep Learning
  • Female
  • Handwriting
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Male
  • Middle Aged
  • Neural Networks, Computer
  • Pattern Recognition, Automated / methods*
  • Writing / standards*

Grants and funding

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01431) supervised by the IITP (Institute for Information & communications Technology Promotion).