LHPE-nets: A lightweight 2D and 3D human pose estimation model with well-structural deep networks and multi-view pose sample simplification method

Hao Wang; Ming-Hui Sun; Hao Zhang; Li-Yan Dong

doi:10.1371/journal.pone.0264302

PLoS One. 2022 Feb 23;17(2):e0264302. doi: 10.1371/journal.pone.0264302. eCollection 2022.

Authors

Hao Wang¹ ², Ming-Hui Sun¹ ², Hao Zhang¹, Li-Yan Dong¹ ²

Affiliations

¹ College of Computer Science and Technology, Jilin University, Changchun, China.
² Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China.

Abstract

Cross-view 3D human pose estimation models have made significant progress: through multi-view fusion, they better accomplish human joint localization and skeleton modeling in 3D. The multi-view 2D pose estimation part of such a model is very important, but its training cost is also very high. It uses deep learning networks to generate heatmaps for each view. In this article, we therefore tested several newer deep learning networks on pose estimation tasks: Mobilenetv2, Mobilenetv3, Efficientnetv2 and Resnet. Based on the performance and drawbacks of these networks, we then built multiple deep learning networks with better performance. We call our networks LHPE-nets, which mainly comprise the Low-Span network and the RDNS network. LHPE-nets use a network structure with evenly distributed channels, inverted residuals, external residual blocks and a framework for processing small-resolution samples, so that training reaches saturation faster. We also designed a static pose sample simplification method for 3D pose data, which stores samples at low cost and makes them convenient for models to read. In the experiments, we compared against several recent models using two public evaluation metrics. The results show the advantage of this work in fast start-up and network lightweightness: during training it is about 1-5 epochs faster than Resnet-34. They also show improved accuracy in estimating individual joints, with the estimation performance improved for approximately 60% of the joints, and the overall human pose estimation performance exceeds that of the other networks by more than 7 mm. The experiments analyze in detail the network size, the fast start-up and the 2D and 3D pose estimation performance of the proposed model. Compared with other pose estimation models, its performance reaches a higher level of practical applicability.
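As a rough illustration of two ideas the abstract mentions, per-view heatmaps and joint errors reported in millimetres, the following minimal sketch decodes a joint location from a toy heatmap by taking its highest-scoring cell and computes a mean per-joint Euclidean error. This is not the authors' code. The function names `decode_heatmap` and `mean_joint_error`, the argmax decoding rule and the choice of a mean Euclidean distance are assumptions made for illustration; the abstract does not name the two evaluation metrics it uses.

```python
import math


def decode_heatmap(heatmap):
    """Return the (x, y) cell with the highest score in a 2D heatmap.

    Assumption: argmax decoding; the paper may use a different rule.
    `heatmap` is a list of rows, each row a list of floats.
    """
    best_x, best_y, best_score = 0, 0, -math.inf
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_x, best_y, best_score = x, y, score
    return best_x, best_y


def mean_joint_error(predicted, target):
    """Mean Euclidean distance between matching joints.

    Assumption: a generic per-joint distance metric, in the same
    units as the inputs (e.g. millimetres).
    """
    if not predicted or len(predicted) != len(target):
        raise ValueError("joint lists must be non-empty and of equal length")
    total = sum(math.dist(p, t) for p, t in zip(predicted, target))
    return total / len(predicted)


# Toy 3x3 heatmap for one joint in one view; the peak is at x=1, y=1.
heatmap = [
    [0.0, 0.1, 0.0],
    [0.1, 0.9, 0.2],
    [0.0, 0.2, 0.1],
]
joint_xy = decode_heatmap(heatmap)

# Two hypothetical 3D joints (in mm): the first is exact, the second is 5 mm off.
pred_3d = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
true_3d = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mean_joint_error(pred_3d, true_3d)
```

In this toy case the decoded joint is the central cell and the mean error is the average of 0 mm and 5 mm. A figure such as "more than 7 mm" in the abstract would be a difference between averages of this general kind, taken over all joints and test samples.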

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Biometric Identification / methods*
Deep Learning*
Humans
Imaging, Three-Dimensional / methods*
Posture*

Grants and funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61272209, 61872164), in part by the Program of Science and Technology Development Plan of Jilin Province of China under Grant 20190302032GX, and in part by the Fundamental Research Funds for the Central Universities (Jilin University). Grant recipient: Ming-Hui Sun.