Spatial Attention-Based 3D Graph Convolutional Neural Network for Sign Language Recognition

Sensors (Basel). 2022 Jun 16;22(12):4558. doi: 10.3390/s22124558.

Abstract

Sign language is the main channel through which hearing-impaired people communicate with others. It is a visual language that conveys highly structured components through manual and non-manual parameters, so hearing people need considerable effort to master it. Sign language recognition aims to ease this difficulty and bridge the communication gap between hearing-impaired people and others. This study presents an efficient architecture for sign language recognition based on a graph convolutional neural network (GCN). The presented architecture consists of a few separable 3DGCN layers, which are enhanced by a spatial attention mechanism. The limited number of layers enables the architecture to avoid the over-smoothing problem common in deep graph neural networks. Furthermore, the attention mechanism enhances the spatial context representation of the gestures. The proposed architecture is evaluated on different datasets and shows outstanding results.
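
The general idea of combining graph convolution with spatial attention over skeleton joints can be illustrated with a minimal NumPy sketch. This is not the authors' separable 3DGCN layer: it applies a single spatial graph convolution per frame and omits temporal convolution entirely. The function names, the dot-product form of the attention, the additive way attention is combined with the adjacency matrix, and all shapes (4 frames, 5 joints, 3 input channels, 8 output channels) are assumptions made only for illustration.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a joint graph."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_graph_conv(x, adj_norm, weight, w_query, w_key):
    """Hypothetical attention-enhanced spatial graph convolution.

    x: (T, V, C_in) joint features for T frames and V joints.
    Returns an array of shape (T, V, C_out).
    """
    q = x @ w_query                          # (T, V, d)
    k = x @ w_key                            # (T, V, d)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)          # (T, V, V), rows sum to 1
    # Assumption: learned attention is added to the fixed skeleton graph.
    mixed = adj_norm[None, :, :] + attn
    out = mixed @ x @ weight                 # aggregate neighbors, then project
    return np.maximum(out, 0.0)              # ReLU

# Toy skeleton: 5 joints connected in a chain (illustrative only).
rng = np.random.default_rng(0)
T, V, C_IN, C_OUT, D = 4, 5, 3, 8, 4
adj = np.zeros((V, V))
for i in range(V - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
adj_norm = normalize_adjacency(adj)

x = rng.normal(size=(T, V, C_IN))
out = attention_graph_conv(
    x,
    adj_norm,
    weight=rng.normal(size=(C_IN, C_OUT)),
    w_query=rng.normal(size=(C_IN, D)),
    w_key=rng.normal(size=(C_IN, D)),
)
print(out.shape)
```

Keeping the stack of such layers shallow, as the abstract describes, limits how many times neighboring joint features are averaged together, which is the mechanism behind over-smoothing in deep graph networks.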

Keywords: attention; deep learning; graph convolutional neural network (GCN); sign language recognition.

MeSH terms

  • Gestures
  • Humans
  • Language
  • Neural Networks, Computer*
  • Recognition, Psychology
  • Sign Language*