Deep Temporal Model-Based Identity-Aware Hand Detection for Space Human-Robot Interaction

IEEE Trans Cybern. 2022 Dec;52(12):13738-13751. doi: 10.1109/TCYB.2021.3114031. Epub 2022 Nov 18.

Abstract

Hand detection is a crucial technology for space human-robot interaction (SHRI), and awareness of hand identities is particularly critical. However, most advanced works have three limitations: 1) low detection accuracy for small objects; 2) insufficient modeling of temporal features between video frames; and 3) the inability to run in real time. In this article, a temporal detector (called TA-RSSD) is proposed based on the SSD and spatiotemporal long short-term memory (ST-LSTM) for real-time detection in SHRI applications. Next, based on online tubelet analysis, a real-time identity-awareness module is designed for identifying multiple hand objects. Several notable properties are as follows: 1) the hybrid structure of ResNet-101 and the SSD improves the detection accuracy of small objects; 2) a three-level feature pyramid structure retains rich semantic information without losing detailed information; 3) a group of redesigned temporal attentional LSTMs (TA-LSTMs) is used to model the three-level feature maps, which effectively achieves background suppression and scale suppression; 4) low-level attention maps are used to eliminate intra-class similarity between hand objects, which improves the accuracy of identity awareness; and 5) a novel association training scheme enhances the temporal coherence between frames. The proposed model is evaluated on the SHRI-VID dataset (collected according to the task requirements), the AU-AIR dataset, and the ImageNet-VID benchmark. Extensive ablation studies and comparisons of detection and identity-awareness capabilities show the superiority of the proposed model. Finally, a set of real-world tests is conducted on a space robot, and the results show that the proposed model achieves real-time speed and high accuracy.
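The abstract does not specify how the online tubelet analysis associates detections across frames. As a rough illustration of the general idea only, the sketch below links per-frame boxes into tubelets by greedy IoU matching, so each hand keeps a stable identity over time. The names `OnlineTubeletTracker`, `Tubelet`, and `iou_threshold`, the value 0.3, and the greedy matching rule are all assumptions made for this example; they are not taken from the article, which additionally uses low-level attention maps to separate similar-looking hands.

```python
from dataclasses import dataclass, field

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2); assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tubelet:
    identity: int
    boxes: list[Box] = field(default_factory=list)


class OnlineTubeletTracker:
    """Hypothetical greedy IoU linker; not the article's identity module."""

    def __init__(self, iou_threshold: float = 0.3):  # threshold is an assumption
        self.iou_threshold = iou_threshold
        self.tubelets: list[Tubelet] = []
        self._next_id = 0

    def update(self, detections: list[Box]) -> list[int]:
        """Assign an identity to each detection in the current frame."""
        ids = [-1] * len(detections)
        # Score every (tubelet, detection) pair against the tubelet's last box.
        pairs = sorted(
            (
                (iou(t.boxes[-1], d), ti, di)
                for ti, t in enumerate(self.tubelets)
                for di, d in enumerate(detections)
            ),
            reverse=True,
        )
        used_t, used_d = set(), set()
        for score, ti, di in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to be linked
            if ti in used_t or di in used_d:
                continue
            self.tubelets[ti].boxes.append(detections[di])
            ids[di] = self.tubelets[ti].identity
            used_t.add(ti)
            used_d.add(di)
        # Unmatched detections start new tubelets with fresh identities.
        for di, d in enumerate(detections):
            if di not in used_d:
                self.tubelets.append(Tubelet(self._next_id, [d]))
                ids[di] = self._next_id
                self._next_id += 1
        return ids


tracker = OnlineTubeletTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(51, 51, 61, 61), (1, 1, 11, 11)])  # order swapped
print(frame1, frame2)
```

In this sketch the identities follow the hands even when the detector returns them in a different order. Tubelets are never retired here; a practical system would also drop tubelets that go unmatched for several frames.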

MeSH terms

  • Attention
  • Humans
  • Neural Networks, Computer*
  • Robotics*
  • Semantics