Writer verification of partially damaged handwritten Arabic documents based on individual character shapes

Majid A Khan; Nazeeruddin Mohammad; Ghassen Ben Brahim; Abul Bashar; Ghazanfar Latif

doi:10.7717/peerj-cs.955

PeerJ Comput Sci. 2022 Apr 20:8:e955. doi: 10.7717/peerj-cs.955. eCollection 2022.

Authors

Majid A Khan¹, Nazeeruddin Mohammad¹, Ghassen Ben Brahim¹, Abul Bashar¹, Ghazanfar Latif¹

Affiliation

¹ College of Computer Engineering and Science, Prince Mohammad Bin Fahd University, Khobar, Eastern Province, Saudi Arabia.

Abstract

Author verification of handwritten text is required in several application domains and has drawn considerable attention in the research community. Although several approaches have been proposed for text-independent writer verification of handwritten text, none has addressed the case where author verification must rely on partially damaged handwritten documents (e.g., during forensic analysis). In this paper, we propose an approach for offline text-independent writer verification of handwritten Arabic text based on individual character shapes (within the Arabic alphabet). The proposed approach enables writer verification for partially damaged documents, provided that some handwritten characters can still be extracted from the damaged document. We also provide a mechanism to identify which Arabic characters are more effective during the writer verification process. For this purpose, we collected a new dataset, Arabic Handwritten Alphabet, Words and Paragraphs Per User (AHAWP), in a classroom setting with 82 different users. The dataset consists of 53,199 user-written isolated Arabic characters, 8,144 Arabic words, and 10,780 characters extracted from these words. Convolutional neural network (CNN) based models are developed to verify writers from individual characters, achieving an accuracy of 94% for isolated character shapes and 90% for extracted character shapes. Our proposed approach provided up to 95% writer verification accuracy for partially damaged documents.
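As an illustration only, the following sketch shows one way per-character verification results could be combined into a decision for a damaged document from which only some characters are recovered. It is not the authors' method: the abstract does not state an aggregation rule, so the majority vote, the 0.5 threshold, and the names CharScore and verify_document are all hypothetical assumptions. The scores stand in for the output of a per-character CNN verifier, which is not implemented here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CharScore:
    """Hypothetical per-character verifier output (not taken from the paper)."""
    char_label: str  # label of the recovered character shape, e.g. "alef"
    score: float     # assumed probability in [0, 1] that the claimed writer wrote it


def verify_document(scores: Sequence[CharScore], threshold: float = 0.5) -> bool:
    """Accept the claimed writer if a strict majority of the recovered
    characters score at or above `threshold`.

    Majority voting and the default threshold are illustrative assumptions.
    """
    if not scores:
        raise ValueError("no characters could be recovered from the document")
    accepted = sum(1 for s in scores if s.score >= threshold)
    return accepted * 2 > len(scores)


# Example: three characters survive in a damaged document (made-up scores).
recovered = [
    CharScore("alef", 0.91),
    CharScore("baa", 0.42),
    CharScore("meem", 0.77),
]
decision = verify_document(recovered)
print(decision)
```

Other rules, such as averaging scores or weighting characters by how discriminative they are, would fit the same interface; the choice here is only for simplicity.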

Keywords: Convolutional Neural Networks (CNN); Writer verification based on character shapes; Writer verification of partially damaged Arabic documents.

Associated data

figshare/10.6084/m9.figshare.17068562.v1

Grants and funding

The authors received no funding for this work.