Deep Learning Technology to Recognize American Sign Language Alphabet

Sensors (Basel). 2023 Sep 19;23(18):7970. doi: 10.3390/s23187970.

Abstract

Historically, individuals with hearing impairments have faced neglect and lacked the tools needed for effective communication. However, advancements in modern technology have paved the way for tools and software aimed at improving the quality of life of hearing-disabled individuals. This research paper presents a comprehensive study employing five distinct deep learning models to recognize hand gestures for the American Sign Language (ASL) alphabet. The primary objective of this study was to leverage contemporary technology to bridge the communication gap between hearing-impaired and hearing individuals. The models utilized in this research (AlexNet, ConvNeXt, EfficientNet, ResNet-50, and VisionTransformer) were trained and tested on an extensive dataset comprising over 87,000 images of ASL alphabet hand gestures. Numerous experiments were conducted in which the models' architectural design parameters were modified to maximize recognition accuracy. The experimental results revealed that ResNet-50 achieved an exceptional accuracy of 99.98%, the highest among all models; EfficientNet attained 99.95%, ConvNeXt 99.51%, and AlexNet 99.50%, while VisionTransformer yielded the lowest accuracy at 88.59%.
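
The following is a minimal sketch of the transfer-learning setup the abstract describes, fine-tuning an ImageNet-pretrained ResNet-50 (the study's best-performing model) in PyTorch. The dataset path, the 29-class layout (26 letters plus SPACE, DELETE, and NOTHING, as in the widely used Kaggle ASL Alphabet dataset of roughly 87,000 images), and all hyperparameters are illustrative assumptions; the abstract does not specify the authors' exact configuration.

    # Illustrative transfer-learning sketch, not the authors' published code.
    # Assumes a Kaggle-style ASL Alphabet layout:
    #   data/asl_alphabet_train/<class_name>/*.jpg  with 29 classes.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    NUM_CLASSES = 29  # assumption: 26 letters + SPACE, DELETE, NOTHING

    transform = transforms.Compose([
        transforms.Resize((224, 224)),  # input size expected by ResNet-50
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                             std=[0.229, 0.224, 0.225]),
    ])

    train_set = datasets.ImageFolder("data/asl_alphabet_train", transform=transform)
    train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

    # Start from ImageNet-pretrained weights and swap in a 29-way classifier head.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # illustrative value

    model.train()
    for epoch in range(5):  # illustrative epoch count
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

The same pattern applies to the other four architectures: load a pretrained backbone from torchvision (or, for VisionTransformer, a ViT implementation), replace the final classification layer to output 29 classes, and fine-tune on the gesture images.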

Keywords: AlexNet; American sign language; ConvNeXt; EfficientNet; ResNet-50; VisionTransformer; deep learning; image-based; transfer learning.

MeSH terms

  • Deep Learning*
  • Gestures
  • Humans
  • Quality of Life
  • Sign Language*
  • Technology
  • United States

Grants and funding

This research received no external funding.