Pashto Handwritten Invariant Character Trajectory Prediction Using a Customized Deep Learning Technique

Sensors (Basel). 2023 Jun 30;23(13):6060. doi: 10.3390/s23136060.

Abstract

Before the 19th century, all communication and official records relied on handwritten documents, which different ethnic groups cherish as valuable artefacts. While significant efforts have been made to automate the transcription of major languages such as English, French, Arabic, and Chinese, regional and minor languages have received less research attention despite their geographical and historical importance. This research focuses on detecting and recognizing Pashto handwritten characters and ligatures, which is essential for preserving Pashto, a regional cursive language in Pakistan and the national language of Afghanistan. Deep learning techniques were employed for detection and recognition, using a newly developed Pashto-specific dataset. The dataset was further enhanced through data augmentation, i.e., scaling and rotation of Pashto handwritten characters and ligatures, which produced many variations of a single trajectory. Morphological operations were applied to minimize gaps in the trajectories, and a median filter was used to remove noise. This dataset will be combined with the existing PHWD-V2 dataset. Various deep learning techniques were evaluated, including VGG19, MobileNetV2, MobileNetV3, and a customized CNN. The customized CNN demonstrated the highest accuracy and the lowest loss, achieving a training accuracy of 93.98%, a validation accuracy of 92.08%, and a testing accuracy of 92.99%.
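The preprocessing and augmentation steps named in the abstract (median filtering, morphological gap closing, rotation, and scaling) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 3x3 window and structuring element, and the use of SciPy's `ndimage` module are all assumptions chosen for illustration, and the abstract does not state which parameters or libraries were actually used.

```python
import numpy as np
from scipy import ndimage


def remove_noise(img, size=3):
    """Median-filter a binary image; the window size is an assumed value."""
    # A median over a size x size window suppresses isolated noisy pixels.
    return ndimage.median_filter(img.astype(np.uint8), size=size)


def close_gaps(img, size=3):
    """Bridge small breaks in a stroke with a morphological closing."""
    # Closing = dilation followed by erosion with the same structuring element.
    structure = np.ones((size, size), dtype=bool)
    closed = ndimage.binary_closing(img.astype(bool), structure=structure)
    return closed.astype(np.uint8)


def fit_to_shape(img, shape):
    """Center-crop or zero-pad a 2-D image to the requested shape."""
    out = np.zeros(shape, dtype=img.dtype)
    h = min(shape[0], img.shape[0])
    w = min(shape[1], img.shape[1])
    src_r, src_c = (img.shape[0] - h) // 2, (img.shape[1] - w) // 2
    dst_r, dst_c = (shape[0] - h) // 2, (shape[1] - w) // 2
    out[dst_r:dst_r + h, dst_c:dst_c + w] = img[src_r:src_r + h, src_c:src_c + w]
    return out


def augment(img, angle_deg, scale):
    """Return a rotated, rescaled copy with the same shape as the input."""
    # order=0 (nearest neighbour) keeps the image binary after resampling.
    rotated = ndimage.rotate(img, angle_deg, reshape=False, order=0)
    zoomed = ndimage.zoom(rotated, scale, order=0)
    return fit_to_shape(zoomed, img.shape)


def make_variations(img, angles=(-10, 0, 10), scales=(0.8, 1.0, 1.2)):
    """Generate one augmented copy per (angle, scale) pair (assumed ranges)."""
    return [augment(img, a, s) for a in angles for s in scales]
```

With the assumed defaults, `make_variations` turns one character image into nine variations, which mirrors the idea of deriving many variations from a single trajectory. Nearest-neighbour resampling is used here only so that the augmented images stay binary; other interpolation orders would need re-thresholding.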

Keywords: Pashto handwriting trajectories; and recognition; customized CNN; enhance PHWD-V2 dataset; prediction.

MeSH terms

  • Deep Learning*
  • Handwriting
  • Humans
  • Language
  • Neural Networks, Computer*
  • Pattern Recognition, Automated / methods