Social Grouping for Multi-Target Tracking and Head Pose Estimation in Video

IEEE Trans Pattern Anal Mach Intell. 2016 Oct;38(10):2082-95. doi: 10.1109/TPAMI.2015.2505292. Epub 2015 Dec 3.

Abstract

Many computer vision tasks are more difficult when tackled without contextual information. For example, in multi-camera tracking, pedestrians may look very different in different cameras under varying pose and lighting conditions. Similarly, head direction estimation is challenging in high-angle surveillance video, where human head images are low resolution. Even humans can have trouble without contextual information. In this work, we couple a novel source of contextual information, social grouping, with two important computer vision tasks: multi-target tracking and head pose/direction estimation in surveillance video. These three components are modeled in a probabilistic formulation, and we provide effective solvers. We show that social grouping effectively helps to mitigate visual ambiguities in multi-camera tracking and head pose estimation. We further observe that in single-camera multi-target tracking, social grouping provides a natural high-order association cue that avoids the need for existing complex algorithms for high-order track association. In experiments on a number of publicly available datasets for tracking, head pose estimation, and group discovery, we demonstrate improvements with our model over models without social grouping context and over several state-of-the-art approaches.

MeSH terms

  • Algorithms*
  • Head*
  • Humans
  • Lighting
  • Posture
  • Video Recording