Tamil handwritten palm leaf manuscript dataset (THPLMD)

Data Brief. 2024 Jan 30:53:110100. doi: 10.1016/j.dib.2024.110100. eCollection 2024 Apr.

Abstract

Most palm leaf manuscripts survive only in deteriorated condition, with damage such as cracks, discoloration, moisture and humidity effects, and insect bites. Such manuscripts are therefore challenging material for research. We captured 262 samples of deteriorated Tamil palm leaves from three works: 'Naladiyar' (27), 'Tholkappiyam' (221), and 'Thirikadugam' (14), which deal with human health, discipline, and an authoritative treatment of Tamil grammar. We contribute a high-quality raw dataset captured with a Nikon camera, pre-enhanced samples produced with image-editing software, and ground truth images obtained by binarization with the Otsu threshold. Because preparing such ground truth is highly time-consuming, we provide it as readily accessible content that can play a vital role in machine learning, deep learning, transfer learning, AI, and artificial neural network (ANN) research.

Keywords: Binarization; Enhancement; Ground truth; Otsu; Photoshop; Segmentation; Tamil palm leaf.
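
The abstract names Otsu thresholding as the binarization step but does not say which tool or parameters were used. As an illustration only, the following is a minimal NumPy sketch of the textbook Otsu method, assuming an 8-bit grayscale input in which the ink is darker than the leaf. The function names `otsu_threshold` and `binarize` are hypothetical and are not taken from the dataset or its authors.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance.

    Assumes `gray` is a uint8 array (values 0..255). Pixels <= t form one
    class and pixels > t form the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()                 # normalized histogram
    levels = np.arange(256)

    omega0 = np.cumsum(prob)                 # weight of the class <= t
    mu = np.cumsum(prob * levels)            # cumulative first moment up to t
    mu_total = mu[-1]                        # global mean intensity

    denom = omega0 * (1.0 - omega0)          # product of the two class weights
    sigma_b = np.zeros(256)
    valid = denom > 1e-12                    # skip thresholds with an empty class
    sigma_b[valid] = (mu_total * omega0[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))


def binarize(gray):
    """Binarize with Otsu's threshold: pixels > t become 255, the rest 0."""
    t = otsu_threshold(gray)
    binary = np.where(gray > t, 255, 0).astype(np.uint8)
    return binary, t


if __name__ == "__main__":
    # Tiny synthetic stand-in for a leaf image: dark "ink" (50) on a
    # lighter "leaf" background (200). Not real dataset content.
    sample = np.array([[50, 50, 200, 200],
                       [50, 200, 50, 200]], dtype=np.uint8)
    binary, t = binarize(sample)
    print("threshold:", t)
    print(binary)
```

On real manuscript photographs a single global threshold may struggle with uneven discoloration, which is consistent with the abstract applying enhancement before binarization.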