AMHGCN: Adaptive multi-level hypergraph convolution network for human motion prediction

Jinkai Li; Jinghua Wang; Lian Wu; Xin Wang; Xiaoling Luo; Yong Xu

doi:10.1016/j.neunet.2024.106153

Neural Netw. 2024 Apr:172:106153. doi: 10.1016/j.neunet.2024.106153. Epub 2024 Jan 29.

Authors

Jinkai Li¹, Jinghua Wang¹, Lian Wu², Xin Wang¹, Xiaoling Luo³, Yong Xu⁴

Affiliations

¹ School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China.
² School of Mathematics and Big Data, GuiZhou Education University, Guiyang 550018, China.
³ College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China.
⁴ School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Shenzhen 518055, China; Peng Cheng Laboratory, Shenzhen 518055, China. Electronic address: laterfall@hit.edu.cn.

PMID: 38306784
DOI: 10.1016/j.neunet.2024.106153

Abstract

Human motion prediction is a key technology for many real-life applications, e.g., self-driving and human-robot interaction. Recent approaches adopt an unrestricted, fully connected graph representation to capture the relationships within the human skeleton. However, two issues remain: (i) these unrestricted, fully connected graph representations neglect the inherent dependencies among the joints of the human body; (ii) these methods represent human motion using features extracted at a single level, and thus can neither fully exploit the various connection relationships within the human body nor guarantee that the predicted motion is reasonable. To tackle these issues, we propose an adaptive multi-level hypergraph convolution network (AMHGCN), which uses an adaptive multi-level hypergraph representation to capture the various dependencies within the human body. Our method uses four levels of hypergraph representation: (i) a joint-level hypergraph representation to capture inherent kinetic dependencies in the human body, (ii) a part-level hypergraph representation to exploit kinetic characteristics at a higher level than the joint level by treating a part of the human body as a single unit, (iii) a component-level hypergraph representation to model semantic information, and (iv) a global-level hypergraph representation to extract long-distance dependencies in the human body. In addition, to take full advantage of the knowledge carried in the training data, we propose a reverse loss (i.e., using future human poses to predict historical poses in reverse) as a form of data augmentation. Extensive experiments show that the proposed AMHGCN achieves state-of-the-art performance on three benchmarks, i.e., Human3.6M, CMU-Mocap, and 3DPW.
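The abstract does not give the layer equations, so the following is only a minimal sketch of how a hypergraph over skeleton joints might propagate features. It assumes the widely used normalized hypergraph propagation rule, Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X, with no learnable weights, which may differ from the adaptive formulation in AMHGCN. The function names `incidence_matrix` and `hypergraph_propagate`, the six-joint toy skeleton, and the joint groupings are all hypothetical.

```python
import math


def incidence_matrix(num_joints, hyperedges):
    """Build H where H[v][e] = 1.0 if joint v belongs to hyperedge e."""
    H = [[0.0] * len(hyperedges) for _ in range(num_joints)]
    for e, members in enumerate(hyperedges):
        for v in members:
            H[v][e] = 1.0
    return H


def hypergraph_propagate(X, H, edge_weights=None):
    """One parameter-free step of Dv^-1/2 H W De^-1 H^T Dv^-1/2 X.

    X is a list of per-joint feature vectors; H is the incidence matrix.
    This is an assumed, generic rule, not the paper's exact layer.
    """
    n, m, d = len(H), len(H[0]), len(X[0])
    w = edge_weights if edge_weights is not None else [1.0] * m
    # Vertex degree: weighted number of hyperedges containing each joint.
    dv = [sum(w[e] * H[v][e] for e in range(m)) for v in range(n)]
    # Edge degree: number of joints in each hyperedge.
    de = [sum(H[v][e] for v in range(n)) for e in range(m)]
    inv_sqrt_dv = [1.0 / math.sqrt(x) if x > 0 else 0.0 for x in dv]

    # Gather: average the normalized joint features inside each hyperedge.
    edge_feats = []
    for e in range(m):
        acc = [0.0] * d
        for v in range(n):
            if H[v][e]:
                for k in range(d):
                    acc[k] += inv_sqrt_dv[v] * X[v][k]
        edge_feats.append([w[e] * a / de[e] if de[e] > 0 else 0.0 for a in acc])

    # Scatter: each joint sums the features of the hyperedges it belongs to.
    out = []
    for v in range(n):
        row = [0.0] * d
        for e in range(m):
            if H[v][e]:
                for k in range(d):
                    row[k] += inv_sqrt_dv[v] * edge_feats[e][k]
        out.append(row)
    return out


# Hypothetical toy skeleton: joints 0-2 form one arm, joints 3-5 the other.
part_level = [[0, 1, 2], [3, 4, 5]]
# A single hyperedge spanning every joint stands in for the global level.
global_level = [[0, 1, 2, 3, 4, 5]]

X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
part_out = hypergraph_propagate(X, incidence_matrix(6, part_level))
multi_out = hypergraph_propagate(X, incidence_matrix(6, part_level + global_level))
```

With only the part-level hyperedges, each joint ends up with the mean feature of its own arm. Adding the global hyperedge mixes in the whole-body mean, which illustrates in a simplified way how a global level can carry long-distance dependencies that a part level alone cannot.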

Keywords: Graph convolutional network; Human motion prediction; Hypergraph representation.

MeSH terms

Benchmarking*
Humans
Knowledge*
Motion
Semantics