Deep learning-based recognition system for Pashto handwritten text: benchmark on PHTI

Ibrar Hussain; Riaz Ahmad; Khalil Ullah; Siraj Muhammad; Rasha Elhassan; Ikram Syed

doi:10.7717/peerj-cs.1925

PeerJ Comput Sci. 2024 Mar 27:10:e1925. doi: 10.7717/peerj-cs.1925. eCollection 2024.

Authors

Ibrar Hussain¹,², Riaz Ahmad¹, Khalil Ullah³, Siraj Muhammad¹, Rasha Elhassan⁴, Ikram Syed⁵

Affiliations

¹ Department of Computer Science, Shaheed Benazir Bhutto University, Sheringel, Dir, Pakistan.
² Department of Computer Science & IT, University of Malakand, Chakdara, Pakistan.
³ Department of Software Engineering, University of Malakand, Chakdara, Pakistan.
⁴ Department of Computer Science, King Khalid University, Abha, Saudi Arabia.
⁵ AI and Software, Gachon University, Seongnam-si, Republic of South Korea.

Abstract

This article introduces a recognition system for handwritten text in the Pashto language, the first attempt to establish a baseline system on the Pashto Handwritten Text Imagebase (PHTI) dataset. The PHTI dataset was first pre-processed to eliminate unwanted characters and then divided into training (70%), validation (15%), and test (15%) sets. The proposed recognition system is based on multi-dimensional long short-term memory (MD-LSTM) networks. A comprehensive empirical analysis was conducted to determine the optimal parameters for the proposed MD-LSTM architecture. Counter experiments were used to evaluate the performance of the proposed system against state-of-the-art models on the PHTI dataset. The novelty of the proposed model, compared with other state-of-the-art models, lies in its hidden layer sizes (10, 20, 80) and its Tanh layer sizes (20, 40). The system achieves a character error rate (CER) of 20.77% on the test set as a baseline. The top 20 confusions are reported to examine the performance and limitations of the proposed model. The results highlight the complications and future prospects of the Pashto language in its digital transition.
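
For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate is commonly computed, assuming the usual definition: total Levenshtein edit distance between recognized and reference text lines, divided by the total number of reference characters. The abstract does not state the exact formula or character unit the authors used, so the function names `edit_distance` and `character_error_rate` are illustrative and not taken from the paper. Python counts Unicode code points here, which may differ from the authors' counting for Pashto text with combining marks.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str],
                         hypotheses: Sequence[str]) -> float:
    """CER in percent over paired reference and hypothesis text lines.

    Assumed definition: 100 * (sum of edit distances) / (sum of reference
    lengths). Lengths are counted in Unicode code points.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars
```

Under this definition, a CER of 20.77% would mean roughly one character edit for every five reference characters on the test set.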

Keywords: Deep learning; Natural language processing; Optical character recognition; Pashto handwritten text imagebase.

Grants and funding

This work was supported by the King Khalid University Deanship of Scientific Research through the General Research Project under grant number (GRP/172/44/1444). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.