Regionlets for Generic Object Detection

IEEE Trans Pattern Anal Mach Intell. 2015 Oct;37(10):2071-84. doi: 10.1109/TPAMI.2015.2389830.

Abstract

Generic object detection must cope with varying degrees of variation, caused by viewpoint changes or deformations in distinct object classes, while keeping computation tractable. This demands descriptive and flexible object representations that can be evaluated efficiently at many locations. We propose to model an object class with a cascaded boosting classifier that integrates various types of features from competing local regions, each of which may consist of a group of subregions called regionlets. A regionlet is a base feature extraction region defined proportionally to a detection window at an arbitrary resolution (i.e., size and aspect ratio). Regionlets are organized in small groups with stable relative positions so that they can delineate fine-grained spatial layouts inside objects. Within each group, their features are aggregated into a one-dimensional feature so that the representation tolerates deformations. The most discriminative regionlets for each object class are selected through a boosting learning procedure. Our regionlet approach achieves very competitive performance on popular multi-class detection benchmark datasets with a single method, without any context. It achieves a detection mean average precision of 41.7 percent on the PASCAL VOC 2007 dataset and 39.7 percent on VOC 2010 for 20 object categories. We further develop support pixel integral images to efficiently augment regionlet features with the responses learned by deep convolutional neural networks. Our regionlet-based method won second place in the ImageNet Large Scale Visual Object Recognition Challenge (ILSVRC 2013).
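
The two central ideas in the abstract, a regionlet defined proportionally to the detection window and a group of regionlets whose features are aggregated into a single value, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Regionlet`, `to_pixels`, and `group_feature` are hypothetical, the feature extractor is a toy stand-in, and the use of `max` as the aggregation operator is an assumption, since the abstract says only that the features are aggregated into a one-dimensional feature.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Sequence, Tuple

# A box in pixel coordinates: (x0, y0, x1, y1).
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Regionlet:
    """Hypothetical regionlet: a box given as fractions of the window size."""
    left: float
    top: float
    right: float
    bottom: float

    def to_pixels(self, window: Box) -> Box:
        """Map the normalized box into a window of any size or aspect ratio."""
        x0, y0, x1, y1 = window
        w, h = x1 - x0, y1 - y0
        return (x0 + self.left * w, y0 + self.top * h,
                x0 + self.right * w, y0 + self.bottom * h)


def group_feature(
    regionlets: Sequence[Regionlet],
    window: Box,
    extract: Callable[[Box], float],
    aggregate: Callable[[Iterable[float]], float] = max,  # assumed operator
) -> float:
    """Aggregate one scalar feature over a group of regionlets.

    `extract` stands in for any one-dimensional feature computed on a box.
    """
    return aggregate(extract(r.to_pixels(window)) for r in regionlets)


def box_area(box: Box) -> float:
    """Toy feature used only to exercise the sketch."""
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)


if __name__ == "__main__":
    group = [Regionlet(0.25, 0.5, 0.75, 1.0), Regionlet(0.0, 0.0, 0.25, 0.5)]
    window = (10.0, 20.0, 110.0, 70.0)
    print(group[0].to_pixels(window))
    print(group_feature(group, window, box_area))
```

Because each regionlet stores fractions rather than pixels, the same group can be evaluated on detection windows of any size and aspect ratio. Taking a single aggregated value over the group means the response does not depend on which member of the group produced it, which is one way to read the deformation tolerance described above.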