Global Co-Occurrence Feature and Local Spatial Feature Learning for Skeleton-Based Action Recognition

Jun Xie; Wentian Xin; Ruyi Liu; Qiguang Miao; Lijie Sheng; Liang Zhang; Xuesong Gao

doi:10.3390/e22101135

Entropy (Basel). 2020 Oct 6;22(10):1135. doi: 10.3390/e22101135.

Authors

Jun Xie¹, Wentian Xin¹, Ruyi Liu¹, Qiguang Miao¹, Lijie Sheng¹, Liang Zhang¹, Xuesong Gao²

Affiliations

¹ School of Computer Science and Technology, Xidian University, Xi'an 710071, China.
² State Key Laboratory of Digital Multimedia Technology, Hisense Co., Ltd., Qingdao 266071, China.

Abstract

Recent progress on skeleton-based action recognition has been substantial, driven largely by the rapid development of Graph Convolutional Networks (GCNs). However, prevailing GCN-based methods may not effectively capture the global co-occurrence features among joints or the local spatial structure features formed by adjacent bones. They also ignore the effect on model performance of channels unrelated to action recognition. To address these issues, we propose a Global Co-occurrence feature and Local Spatial feature learning model (GCLS) consisting of two branches. The first, the Vertex Attention Mechanism branch (VAM-branch), captures the global co-occurrence features of actions; the second, the Cross-kernel Feature Fusion branch (CFF-branch), extracts local spatial structure features formed by adjacent bones and suppresses the channels unrelated to action recognition. Extensive experiments on two large-scale datasets, NTU-RGB+D and Kinetics, demonstrate that GCLS achieves the best performance among the mainstream approaches it is compared with.

Keywords: feature fusion; graph convolutional network; skeleton-based action recognition.
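
The abstract names a two-branch design but does not give its equations, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes three stand-in techniques that are not confirmed by the abstract: scaled dot-product self-attention over joints for the global branch, a symmetrically normalized adjacency graph convolution for the local branch, and squeeze-and-excitation style gating to down-weight channels. All function names, the weight matrices `W1` and `W2`, the toy five-joint chain skeleton, and the additive fusion are hypothetical, and bone features and cross-kernel fusion are not modeled.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def vertex_attention(x):
    """Global branch stand-in (assumption): every joint attends to every joint.

    x has shape (C, T, V): channels, frames, joints.
    """
    C, _, _ = x.shape
    desc = x.mean(axis=1)                      # (C, V) per-joint descriptor
    scores = desc.T @ desc / np.sqrt(C)        # (V, V) joint-to-joint similarity
    attn = softmax(scores, axis=-1)            # each row sums to 1
    # out[c, t, v] = sum_u attn[v, u] * x[c, t, u]
    return np.einsum("vu,ctu->ctv", attn, x), attn

def normalized_adjacency(A):
    # D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def local_graph_conv(x, A):
    """Local branch stand-in (assumption): mix only physically adjacent joints."""
    return np.einsum("vu,ctu->ctv", normalized_adjacency(A), x)

def channel_gate(x, W1, W2):
    """Squeeze-and-excitation style gate (assumption) that scales each channel."""
    s = x.mean(axis=(1, 2))                    # (C,) squeeze over time and joints
    h = np.maximum(W1 @ s, 0.0)                # bottleneck with ReLU
    g = 1.0 / (1.0 + np.exp(-(W2 @ h)))        # (C,) gates in (0, 1)
    return x * g[:, None, None], g

def two_branch(x, A, W1, W2):
    # Additive fusion is a hypothetical choice made for this sketch.
    global_feat, attn = vertex_attention(x)
    local_feat, gate = channel_gate(local_graph_conv(x, A), W1, W2)
    return global_feat + local_feat, attn, gate

# Toy input: 4 channels, 3 frames, 5 joints connected in a chain.
rng = np.random.default_rng(0)
C, T, V = 4, 3, 5
x = rng.standard_normal((C, T, V))
A = np.zeros((V, V))
for i in range(V - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
W1 = rng.standard_normal((2, C))
W2 = rng.standard_normal((C, 2))

out, attn, gate = two_branch(x, A, W1, W2)
```

In this sketch the attention matrix is dense, so distant joints (for example a hand and a foot) can influence each other in one step, whereas the adjacency-based branch only mixes neighboring joints; the gate then scales each channel of the local features by a value between 0 and 1.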