KU-BdSL: An open dataset for Bengali sign language recognition

Data Brief. 2023 Nov 11;51:109797. doi: 10.1016/j.dib.2023.109797. eCollection 2023 Dec.

Abstract

Sign language is a medium of communication for people with speech and hearing disabilities. It takes various forms with complex patterns that are difficult for the general public to comprehend. Bengali sign language (BdSL) is one of the more difficult sign languages because of its large number of letters, words, and expression techniques. Machine translation can make it easier for people with these disabilities to communicate with the general public. Within the machine learning (ML) domain, computer vision offers a possible solution, and every ML solution requires an optimized model and a proper dataset. Therefore, in this research work, we have created a BdSL dataset named 'KU-BdSL', which consists of 30 classes describing 38 consonants ('banjonborno') of the Bengali alphabet. The dataset includes 1500 images of hand signs in total, each representing one or more Bengali consonants. Thirty-nine participants (30 males and 9 females) of different ages (21-38 years) participated in the creation of this dataset. We used smartphones to capture the images because of their readily available high-definition cameras. We believe that this dataset can be beneficial to the deaf and dumb (D&D) community. The dataset makes it feasible to identify the Bengali consonants of BdSL from images or videos. It can also be employed in human-machine interfaces for disabled people. In the future, we will work on the vowels and the word level of BdSL.

Keywords: Bengali sign language; Computer vision; Deep learning; Machine learning; Sign language recognition.