A Modular Neural Motion Retargeting System Decoupling Skeleton and Shape Perception

IEEE Trans Pattern Anal Mach Intell. 2024 Apr 10:PP. doi: 10.1109/TPAMI.2024.3386777. Online ahead of print.

Abstract

Skinned motion retargeting must map motion between characters whose skeletons differ in structure but correspond to homeomorphic graphs, while preserving motion semantics and perceiving shape geometry. We propose M-R2ET, a modular neural motion retargeting system that addresses these challenges. The key insight behind M-R2ET is to learn residual motion modifications within a canonical skeleton space. Specifically, a cross-structure alignment module learns joint correspondences among diverse skeletons, which enables motion copy and yields a reliable initial motion for semantics and geometry perception. In addition, two residual modification modules, the skeleton-aware module and the shape-aware module, preserve source motion semantics and perceive target character geometry, respectively, effectively reducing interpenetration and contact-missing. Driven by distance-based losses that explicitly model semantics and geometry, these two modules learn residual modifications to the initial motion in a single inference pass, without post-processing. To balance the two modifications, we further introduce a balancing gate that linearly interpolates between them. Extensive experiments on the public Mixamo dataset demonstrate that M-R2ET achieves state-of-the-art performance, enables cross-structure motion retargeting, and strikes a good balance between preserving motion semantics and attenuating interpenetration and contact-missing.
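
The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `balance_residuals`, the flat per-joint scalar representation, and the convention that a gate value of 0 selects the skeleton-aware residual and 1 selects the shape-aware residual are all assumptions made for illustration. In the actual system the residuals and gate values would be predicted by the learned modules.

```python
from typing import List


def balance_residuals(
    initial: List[float],
    skel_delta: List[float],
    shape_delta: List[float],
    gate: List[float],
) -> List[float]:
    """Blend two residual modifications and add them to an initial motion.

    Hypothetical helper (not from the paper): each list holds one scalar
    per joint parameter. gate[i] in [0, 1] linearly interpolates between
    the skeleton-aware residual (gate = 0) and the shape-aware residual
    (gate = 1); this orientation of the gate is an assumption.
    """
    if not (len(initial) == len(skel_delta) == len(shape_delta) == len(gate)):
        raise ValueError("all inputs must have the same length")

    blended = []
    for q0, d_skel, d_shape, g in zip(initial, skel_delta, shape_delta, gate):
        if not 0.0 <= g <= 1.0:
            raise ValueError("gate values must lie in [0, 1]")
        # Linear interpolation between the two residuals, applied on top
        # of the initial (copied) motion.
        blended.append(q0 + (1.0 - g) * d_skel + g * d_shape)
    return blended


if __name__ == "__main__":
    # Toy values only; they do not come from the paper.
    print(balance_residuals([1.0, 2.0], [0.5, 0.5], [-0.5, 1.0], [0.0, 1.0]))
```

With a gate of 0 the result follows the skeleton-aware residual only, and with a gate of 1 it follows the shape-aware residual only; intermediate values trade semantics preservation against geometry awareness under the assumptions stated above.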