TLFS23 Tamil language fingerspelling dataset

Bavesh Ram S; Chirranjeavi M; Aaruran S; Gokulraj Va; Binoy B Nair; Harikumar M E

doi:10.1016/j.dib.2023.109961

Data Brief. 2023 Dec 15:52:109961. doi: 10.1016/j.dib.2023.109961. eCollection 2024 Feb.

Authors

Bavesh Ram S¹, Chirranjeavi M¹, Aaruran S¹, Gokulraj Va¹, Binoy B Nair¹, Harikumar M E¹

Affiliation

¹ Department of Electronics and Communication Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India.

Abstract

Tamil is one of the oldest living languages, spoken by around 65 million people across India, Sri Lanka and South-East Asia. Countries such as Fiji and South Africa also have significant populations with Tamil ancestry. Tamil is a complex language with 247 characters. A labelled dataset for Tamil fingerspelling, named TLFS23, has been created for research on vision-based fingerspelling translators for the speech and hearing impaired. The dataset opens up avenues for developing automated translation and interpretation systems, based on computer vision and deep learning algorithms, for effective communication between fingerspelling users and non-users. One thousand images representing each unique finger flexion motion for every Tamil character were collected, constituting a large dataset of 248 classes with a total of 255,155 images. The images were contributed by 120 individuals from different age groups. The dataset is publicly available at: https://data.mendeley.com/datasets/39kzs5pxmk/2.

Keywords: Computer vision; Image dataset; Indian sign language; Tamil.