Spatio-Temporal Closed-Loop Object Detection

IEEE Trans Image Process. 2017 Mar;26(3):1253-1263. doi: 10.1109/TIP.2017.2651367. Epub 2017 Jan 10.

Abstract

Object detection is one of the most important tasks of computer vision. It is usually performed by evaluating a subset of the possible locations of an image, that are more likely to contain the object of interest. Exhaustive approaches have now been superseded by object proposal methods. The interplay of detectors and proposal algorithms has not been fully analyzed and exploited up to now, although this is a very relevant problem for object detection in video sequences. We propose to connect, in a closed-loop, detectors and object proposal generator functions exploiting the ordered and continuous nature of video sequences. Different from tracking we only require a previous frame to improve both proposal and detection: no prediction based on local motion is performed, thus avoiding tracking errors. We obtain three to four points of improvement in mAP and a detection time that is lower than Faster Regions with CNN features (R-CNN), which is the fastest Convolutional Neural Network (CNN) based generic object detector known at the moment.