Pose-Aware 3D Talking Face Synthesis using Geometry-guided Audio-Vertices Attention

Bo Li; Xiaolin Wei; Bin Liu; Zhifen He; Junjie Cao; Yu-Kun Lai

doi:10.1109/TVCG.2024.3371064

IEEE Trans Vis Comput Graph. 2024 Feb 28:PP. doi: 10.1109/TVCG.2024.3371064. Online ahead of print.

PMID: 38416616

Abstract

Most existing 3D talking face synthesis methods lack detailed facial expressions and realistic head poses, resulting in an unsatisfactory user experience. In this paper, we propose a pose-aware 3D talking face synthesis method built on a novel geometry-guided audio-vertices attention. To capture more detailed expressions, such as subtle nuances of mouth shape and eye movement, we build hierarchical audio features comprising a global attribute feature and a series of vertex-wise local latent movement features. Then, to fully exploit the topology of facial models, we propose a geometry-guided audio-vertices attention module that predicts the displacement of each vertex, using vertex connectivity relations to take full advantage of the corresponding hierarchical audio features. Finally, to accomplish pose-aware animation, we expand the existing database with an additional pose attribute and propose a pose estimation module that attends to the whole head model. Numerical experiments demonstrate that the proposed method produces more realistic expressions and head movements than state-of-the-art methods.
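
The abstract does not give the architecture of the attention module, so the following is only an illustrative sketch of one common way to let mesh connectivity guide attention: attention scores between vertices are masked so that each vertex attends only to itself and its mesh neighbours. Every name here (adjacency_from_faces, geometry_masked_attention, the weight matrices, and the choice to add the global feature to each local feature) is an assumption made for illustration, not the authors' implementation.

```python
import numpy as np

def adjacency_from_faces(num_vertices, faces):
    """Boolean vertex adjacency (with self-loops) built from triangle faces."""
    adj = np.eye(num_vertices, dtype=bool)
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            adj[i, j] = adj[j, i] = True
    return adj

def geometry_masked_attention(local_feats, global_feat, adj, w_q, w_k, w_v, w_out):
    """Hypothetical connectivity-masked attention over mesh vertices.

    local_feats: (V, D) vertex-wise audio features (assumed given)
    global_feat: (D,)   global audio feature (assumed given)
    adj:         (V, V) boolean adjacency with self-loops
    Returns (V, 3) per-vertex displacements and the (V, V) attention weights.
    """
    # Assumption: fuse the two feature levels by simple addition.
    x = local_feats + global_feat
    q, k, v = x @ w_q, x @ w_k, x @ w_v

    # Scaled dot-product scores, restricted to connected vertex pairs.
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    scores = np.where(adj, scores, -np.inf)

    # Numerically stable softmax; self-loops keep every row finite.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    displacements = (weights @ v) @ w_out
    return displacements, weights
```

In this sketch, vertices that share no edge receive exactly zero attention weight, so each predicted displacement depends only on features from the vertex's one-ring neighbourhood. A real model would typically stack several such layers and learn the weight matrices.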