Spatial-temporal human gesture recognition under degraded conditions using three-dimensional integral imaging

Opt Express. 2018 May 28;26(11):13938-13951. doi: 10.1364/OE.26.013938.

Abstract

We present spatial-temporal human gesture recognition in degraded conditions, including low light levels and occlusions, using a passive-sensing three-dimensional (3D) integral imaging (InIm) system and 3D correlation filters. The 4D (lateral, longitudinal, and temporal) reconstructed data is processed using a variety of algorithms, including linear and non-linear distortion-invariant filters, and the results are compared with previously reported approaches: the space-time interest points (STIP) feature detector and the 3D histogram of oriented gradients (3D HOG) feature descriptor, each used with a standard bag-of-features support vector machine (SVM) framework. The gesture recognition results of the different classification algorithms are compared using a variety of performance metrics, such as receiver operating characteristic (ROC) curves, area under the curve (AUC), signal-to-noise ratio (SNR), the probability of classification errors, and confusion matrices. Integral imaging video sequences of human gestures are captured under degraded conditions, such as low-light illumination and the presence of partial occlusions. A four-dimensional (4D) reconstructed video sequence is computed that provides lateral and depth information of a scene over time, i.e., (x, y, z, t). A total-variation denoising algorithm is applied to the signal to further reduce noise while preserving data in the video frames. We show that the 4D signal exhibits decreased scene noise, partial occlusion removal, and improved SNR due to the computational InIm and/or denoising algorithms. Finally, gesture recognition is performed with classification algorithms, namely distortion-invariant correlation filters and STIP and 3D HOG with SVM, which are applied to the reconstructed 4D gesture signal to classify the human gesture. Experiments are conducted using a synthetic aperture InIm system in ambient light. Our experiments indicate that the proposed approach is promising for detecting human gestures in degraded conditions such as low illumination with partial occlusion. To the best of our knowledge, this is the first report on spatial-temporal human gesture recognition in degraded conditions using passive sensing 4D integral imaging with nonlinear correlation filters.
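
To make the idea of a nonlinear correlation filter on spatial-temporal data concrete, the following is a minimal sketch of a generic k-th law nonlinear correlation applied to an (x, y, t) volume. It is not the distortion-invariant filter used in the paper, whose design is not given in the abstract. The function name kth_law_correlation, the exponent k = 0.3, the array sizes, and the synthetic random data are all assumptions chosen for illustration.

```python
import numpy as np


def kth_law_correlation(scene, template, k=0.3):
    """Correlate a 3D (x, y, t) scene with a template using a k-th law nonlinearity.

    Hypothetical helper for illustration: k = 1 gives ordinary linear
    (circular) cross-correlation, k = 0 keeps only the phase.
    """
    S = np.fft.fftn(scene)
    # Zero-pad the template to the scene size before transforming.
    R = np.fft.fftn(template, s=scene.shape)
    product = S * np.conj(R)
    # Keep the phase of the cross-spectrum, compress its magnitude by the power k.
    nonlinear = np.abs(product) ** k * np.exp(1j * np.angle(product))
    return np.real(np.fft.ifftn(nonlinear))


# Synthetic example (assumed data): embed a small template in a noisy volume.
rng = np.random.default_rng(0)
template = rng.random((8, 8, 4))
scene = 0.05 * rng.random((32, 32, 16))
scene[10:18, 5:13, 6:10] += template

corr = kth_law_correlation(scene, template)
# Location of the strongest correlation response, as plain Python ints.
peak = tuple(int(i) for i in np.unravel_index(np.argmax(corr), corr.shape))
print(peak)
```

In this sketch the strongest response should fall at the offset where the template was embedded. A classifier built on such a filter would typically threshold the peak value, and sweeping that threshold is one way to produce the ROC curves mentioned above.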

MeSH terms

  • Algorithms
  • Gestures*
  • Humans
  • Image Interpretation, Computer-Assisted / methods*
  • Imaging, Three-Dimensional / methods*
  • Light*
  • Pattern Recognition, Automated / methods*
  • ROC Curve
  • Spatio-Temporal Analysis
  • Support Vector Machine