Scalable Semi-Automatic Annotation for Multi-Camera Person Tracking

IEEE Trans Image Process. 2016 May;25(5):2259-74. doi: 10.1109/TIP.2016.2542021.

Abstract

This paper proposes a generic methodology for the semi-automatic generation of reliable position annotations for evaluating multi-camera people-trackers on large video data sets. Most of the annotation data are computed automatically, by estimating a consensus tracking result from multiple existing trackers and people detectors and classifying it as either reliable or not. A small subset of the data, composed of tracks with insufficient reliability, is verified by a human using a simple binary decision task, a process faster than marking the correct person position. The proposed framework is generic and can handle additional trackers. We present results on a data set of ~6 h captured by 4 cameras, featuring a person in a holiday flat performing activities such as walking, cooking, eating, cleaning, and watching TV. When aiming for a tracking accuracy of 60 cm, 80% of all video frames are automatically annotated. The annotations for the remaining 20% of the frames were added after human verification of an automatically selected subset of data, which required ~2.4 h of manual labor. A subsequent comprehensive visual inspection of the annotation procedure found 99% of the automatically annotated frames to be correct. We provide guidelines on how to apply the proposed methodology to new data sets, as well as an exploratory study of the multi-target case, applied to existing and new benchmark video sequences.
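The core idea above (fuse several trackers' outputs per frame, accept the result only when they agree, and route the rest to a cheap human check) can be sketched as follows. This is an illustrative reduction, not the paper's exact estimator: the median-based consensus, the agreement rule, and all names here are assumptions chosen to match the stated 60 cm accuracy target.

```python
import statistics

AGREEMENT_THRESHOLD_CM = 60  # target tracking accuracy from the abstract


def consensus_annotation(tracker_positions):
    """Fuse per-tracker (x, y) ground-plane estimates (in cm) for one frame.

    Returns (consensus_position, is_reliable). The componentwise median
    and the max-deviation reliability rule are illustrative assumptions,
    not the method described in the paper.
    """
    xs = [p[0] for p in tracker_positions]
    ys = [p[1] for p in tracker_positions]
    consensus = (statistics.median(xs), statistics.median(ys))
    # Consider the frame reliable only if every tracker lies within
    # the accuracy threshold of the consensus position.
    max_deviation = max(
        ((x - consensus[0]) ** 2 + (y - consensus[1]) ** 2) ** 0.5
        for x, y in tracker_positions
    )
    return consensus, max_deviation <= AGREEMENT_THRESHOLD_CM


# Frames flagged as unreliable would be queued for a binary human
# decision ("is this position correct?") rather than full manual
# annotation, which is the source of the time savings reported above.
```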

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Data Curation / methods*
  • Human Activities / classification*
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Video Recording / methods*