Patch Attention Layer of Embedding Handcrafted Features in CNN for Facial Expression Recognition

Sensors (Basel). 2021 Jan 27;21(3):833. doi: 10.3390/s21030833.

Abstract

Facial expression recognition has attracted considerable attention because of its broad range of applications in human-computer interaction systems. Although facial representation is crucial to final recognition accuracy, traditional handcrafted representations reflect only shallow characteristics, and it is uncertain whether a convolutional layer can extract better ones. In addition, sharing weights across the whole image is ill-suited to structured face images. To overcome these limitations, a novel method based on patches of interest, the Patch Attention Layer (PAL) of embedding handcrafted features, is proposed to learn the local shallow facial features of each patch of a face image. First, a handcrafted feature, the Gabor surface feature (GSF), is extracted by convolving the input face image with a set of predefined Gabor filters. Second, the generated feature is segmented into non-overlapping patches, and local shallow features are captured by applying different filters to different local patches. Then, the weighted shallow features are fed into the remaining convolutional layers to capture high-level features. Our method can be applied directly to a static image without facial landmark information, and the preprocessing step is very simple. Experiments on four databases show that our method achieved very competitive performance compared to other state-of-the-art methods (Extended Cohn-Kanade database (CK+): 98.93%; Oulu-CASIA: 97.57%; Japanese Female Facial Expressions database (JAFFE): 93.38%; and RAF-DB: 86.8%).
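The first two steps described above (a predefined Gabor filter bank, then non-overlapping patches that each receive their own weights) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the kernel parameters, the patch size, the function names, and the use of a per-patch channel-mixing matrix as the "different filters for different patches" are all assumptions made for illustration, and the plain filter responses stand in for the Gabor surface feature, whose exact definition the abstract does not give.

```python
import numpy as np


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5):
    """Real (cosine) part of a Gabor filter in its standard textbook form.

    All parameter values used below are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * x_r / wavelength)
    return envelope * carrier


def filter_same(image, kernel):
    """Zero-padded, same-size 2D cross-correlation.

    The cosine Gabor kernel is symmetric under a 180-degree rotation,
    so for this kernel cross-correlation equals convolution.
    """
    half = kernel.shape[0] // 2
    padded = np.pad(image, half)
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def gabor_responses(image, n_orientations=4):
    """Stack of filter responses, shape (n_orientations, H, W)."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    return np.stack(
        [filter_same(image, gabor_kernel(9, 2.0, t, 4.0)) for t in thetas]
    )


def split_patches(feature_maps, patch):
    """(C, H, W) -> (gh, gw, C, patch, patch) non-overlapping patches."""
    c, h, w = feature_maps.shape
    gh, gw = h // patch, w // patch
    cropped = feature_maps[:, :gh * patch, :gw * patch]
    grid = cropped.reshape(c, gh, patch, gw, patch)
    return grid.transpose(1, 3, 0, 2, 4)


def weight_patches(patches, weights):
    """Apply a separate channel-mixing matrix to every patch.

    weights has shape (gh, gw, C_out, C): no weights are shared between
    patches (a hypothetical stand-in for the per-patch filters).
    """
    return np.einsum("ghoc,ghcyx->ghoyx", weights, patches)


rng = np.random.default_rng(0)
face = rng.random((32, 32))                 # placeholder for a grayscale face image
responses = gabor_responses(face)           # (4, 32, 32)
patches = split_patches(responses, 8)       # (4, 4, 4, 8, 8)
weights = rng.normal(size=(4, 4, 2, 4))     # untrained, random per-patch weights
weighted = weight_patches(patches, weights) # (4, 4, 2, 8, 8)
```

In a trained network the per-patch weights would be learned parameters, and the weighted patches would be reassembled and passed to the remaining convolutional layers; the sketch stops before that stage.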

Keywords: convolutional layer; facial expression recognition; facial representation; feature extraction; patch attention; shallow feature.

MeSH terms

  • Databases, Factual
  • Face
  • Facial Expression
  • Facial Recognition*
  • Female
  • Humans