Robust Facial Feature Tracking Using Shape-Constrained Multiresolution-Selected Linear Predictors

Eng-Jon Ong; Richard Bowden

doi:10.1109/TPAMI.2010.205

IEEE Trans Pattern Anal Mach Intell. 2011 Sep;33(9):1844-59. doi: 10.1109/TPAMI.2010.205. Epub 2010 Dec 10.

PMID: 21135441

Abstract

This paper proposes a learned data-driven approach for accurate, real-time tracking of facial features using only intensity information. The task of automatic facial feature tracking is nontrivial since the face is a highly deformable object with large textural variations and motion in certain regions. Existing works attempt to address these problems by either limiting themselves to tracking feature points with strong and unique visual cues (e.g., mouth and eye corners) or by incorporating a priori information that needs to be manually designed (e.g., selecting points for a shape model). The framework proposed here largely avoids the need for such restrictions by automatically identifying the optimal visual support required for tracking a single facial feature point. This automatic identification of the visual context required for tracking allows the proposed method to potentially track any point on the face. Tracking is achieved via linear predictors (LPs), which provide a fast and effective method for mapping pixel intensities into tracked feature position displacements. Building upon the simplicity and strengths of linear predictors, a more robust biased linear predictor is introduced. Multiple linear predictors are then grouped into a rigid flock to further increase robustness. To improve tracking accuracy, a novel probabilistic selection method is used to identify relevant visual areas for tracking a feature point. These selected flocks are then combined into a hierarchical multiresolution LP model. Finally, a simple shape constraint is exploited to correct the occasional tracking failure of a minority of feature points. Experimental results show that this method performs more robustly and accurately than active appearance models (AAMs), using minimal training examples, on example sequences that range from SD quality to YouTube quality. Additionally, an analysis of the visual support consistency across different subjects is provided.
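
As a rough illustration of the basic idea of a linear predictor (a learned linear map from pixel intensity differences to a position displacement), the following minimal Python sketch may help. It is not the authors' implementation: the synthetic image, the random choice of support pixels, the ridge-regularized least-squares fit, and all function names (`make_image`, `sample`, `train_linear_predictor`, `predict_displacement`) are assumptions made here for illustration. It does not cover the biased predictor, flocks, probabilistic selection, multiresolution hierarchy, or shape constraint described in the abstract.

```python
import numpy as np


def make_image(size=64):
    """Smooth synthetic intensity image (a hypothetical stand-in for a face region)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    return np.sin(xs / 5.0) + np.cos(ys / 7.0) + 0.5 * np.sin((xs + ys) / 9.0)


def sample(image, point, offsets):
    """Read intensities at support-pixel offsets around a point given as (x, y)."""
    coords = np.rint(np.asarray(point) + offsets).astype(int)
    return image[coords[:, 1], coords[:, 0]]


def train_linear_predictor(image, ref_point, offsets, n_train=500,
                           max_disp=4, ridge=1e-3, seed=0):
    """Fit H so that H @ (observed - reference intensities) approximates the
    correction that moves a displaced tracker back to ref_point.

    The ridge term is an assumption added here for numerical stability.
    """
    rng = np.random.default_rng(seed)
    ref_point = np.asarray(ref_point, dtype=float)
    ref = sample(image, ref_point, offsets)
    # Synthetic training set: random integer displacements from the true point.
    disps = rng.integers(-max_disp, max_disp + 1, size=(n_train, 2)).astype(float)
    diffs = np.stack([sample(image, ref_point + d, offsets) - ref for d in disps])
    # Ridge-regularized least squares: diffs @ weights ~ -disps.
    gram = diffs.T @ diffs + ridge * np.eye(diffs.shape[1])
    weights = np.linalg.solve(gram, diffs.T @ (-disps))
    return weights.T, ref  # H has shape (2, n_support)


def predict_displacement(H, ref, image, point, offsets):
    """Predict the (dx, dy) correction from intensities observed at `point`."""
    return H @ (sample(image, point, offsets) - ref)


if __name__ == "__main__":
    image = make_image()
    offsets = np.random.default_rng(1).integers(-8, 9, size=(40, 2))
    ref_point = np.array([32.0, 32.0])
    H, ref = train_linear_predictor(image, ref_point, offsets)

    start = ref_point + np.array([2.0, -1.0])  # tracker has drifted
    corrected = start + predict_displacement(H, ref, image, start, offsets)
    print("residual error (pixels):", np.linalg.norm(corrected - ref_point))
```

At run time the prediction is a single matrix-vector product, which is consistent with the abstract's description of linear predictors as fast. In this sketch the support pixels are chosen at random, whereas the paper's contribution is to select the visual support automatically.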

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Biometric Identification / methods*
Databases, Factual
Face / anatomy & histology*
Humans
Image Processing, Computer-Assisted / methods*
Models, Statistical