FineStyle: Semantic-Aware Fine-Grained Motion Style Transfer with Dual Interactive-Flow Fusion

IEEE Trans Vis Comput Graph. 2023 Nov;29(11):4361-4371. doi: 10.1109/TVCG.2023.3320216. Epub 2023 Nov 2.

Abstract

We present FineStyle, a novel framework for motion style transfer that generates expressive human animations in specific styles for virtual reality and computer vision applications. It incorporates semantic awareness, which improves motion representation and enables precise, stylized animation generation. Existing methods for motion style transfer fail to consider the semantic meaning behind the motion, resulting in limited control over the generated human animations. To address this, FineStyle introduces a new cross-modality fusion module called Dual Interactive-Flow Fusion (DIFF). As a first attempt, DIFF integrates motion style features and semantic flows, producing semantic-aware style codes for fine-grained motion style transfer. FineStyle uses an innovative two-stage semantic guidance approach that leverages semantic clues to enhance the discriminative power of both semantic and style features. At the early stage, a semantic-guided encoder injects distinct semantic clues into the style flow. At the fine stage, both flows are fused interactively, selecting the matched and critical clues from each flow. Extensive experiments demonstrate that FineStyle outperforms state-of-the-art methods in visual quality and controllability. By considering the semantic meaning behind motion style patterns, FineStyle allows more precise control over motion styles. Source code and models are available at https://github.com/XingliangJin/Fine-Style.git.
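The abstract describes DIFF as interactively fusing a motion style flow and a semantic flow into semantic-aware style codes. As a rough conceptual illustration only, not the paper's implementation, the minimal PyTorch sketch below fuses two token flows with bidirectional cross-attention and pools them into a single style code; the class name, layer choices, and dimensions are hypothetical assumptions, not taken from the paper.

import torch
import torch.nn as nn

class DualFlowFusionSketch(nn.Module):
    """Hypothetical sketch of a dual interactive-flow fusion block:
    a style flow and a semantic flow attend to each other, and the
    fused result is pooled into one semantic-aware style code."""

    def __init__(self, dim=256, heads=4):
        super().__init__()
        # style tokens query the semantic flow, and vice versa
        self.style_to_sem = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.sem_to_style = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_style = nn.LayerNorm(dim)
        self.norm_sem = nn.LayerNorm(dim)
        self.fuse = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, style_feats, sem_feats):
        # style_feats: (B, T, dim) per-frame motion style features
        # sem_feats:   (B, S, dim) semantic token features
        s2m, _ = self.style_to_sem(style_feats, sem_feats, sem_feats)    # style queries semantics
        m2s, _ = self.sem_to_style(sem_feats, style_feats, style_feats)  # semantics query style
        style = self.norm_style(style_feats + s2m)
        sem = self.norm_sem(sem_feats + m2s)
        # pool each flow and fuse into a single semantic-aware style code
        return self.fuse(torch.cat([style.mean(dim=1), sem.mean(dim=1)], dim=-1))

if __name__ == "__main__":
    fusion = DualFlowFusionSketch()
    style = torch.randn(2, 60, 256)   # e.g., 60 motion frames
    sem = torch.randn(2, 8, 256)      # e.g., 8 semantic tokens
    print(fusion(style, sem).shape)   # torch.Size([2, 256])

For the authors' actual architecture and training details, refer to the released code at the repository linked above.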