A computational framework for attentional object discovery in RGB-D videos

Cogn Process. 2017 May;18(2):169-182. doi: 10.1007/s10339-017-0791-z. Epub 2017 Feb 2.

Abstract

We present a computational framework for attention-guided visual scene exploration in sequences of RGB-D data. To this end, we propose a method that generates visual object candidates, i.e. hypotheses about the objects in the scene. An attention system prioritises the processing of visual information by (1) localising candidate objects, and (2) integrating an inhibition of return (IOR) mechanism grounded in spatial coordinates. Because it operates in spatial rather than image coordinates, this IOR mechanism naturally copes with camera motion and inhibits objects that have already been the target of attention. Our approach provides object candidates that can be processed by higher cognitive modules such as object recognition. Since objects are basic elements for many higher-level tasks, our architecture can serve as a first layer in any cognitive system that aims at interpreting a stream of images. Our evaluation shows that the framework finds most of the objects in challenging real-world scenes.
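The idea of an IOR mechanism grounded in spatial coordinates can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Candidate`, `SpatialIOR`, `camera_to_world`, the fixed inhibition radius, and the assumption that a camera pose (rotation and translation) is available for every frame are all hypothetical choices made for illustration. Each candidate carries a saliency value and a 3D centroid in the camera frame; attended locations are stored in a fixed world frame, so a previously attended object stays inhibited even when the camera moves.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Candidate:
    """Hypothetical object candidate: a saliency score and a 3D centroid."""
    label: str
    saliency: float
    position_cam: Vec3  # centroid in the camera frame (metres)


def camera_to_world(p: Vec3, rotation: Sequence[Sequence[float]],
                    translation: Vec3) -> Vec3:
    """Rigid transform world = R * p + t, with R given as a 3x3 nested sequence."""
    return tuple(
        sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


@dataclass
class SpatialIOR:
    """Inhibition of return stored in world coordinates (illustrative only)."""
    radius: float = 0.15  # assumed inhibition radius in metres
    attended: List[Vec3] = field(default_factory=list)

    def is_inhibited(self, p_world: Vec3) -> bool:
        # A location is inhibited if it lies near any previously attended one.
        return any(math.dist(p_world, q) < self.radius for q in self.attended)

    def select(self, candidates: Sequence[Candidate],
               rotation: Sequence[Sequence[float]],
               translation: Vec3) -> Optional[Candidate]:
        """Return the most salient non-inhibited candidate and inhibit it."""
        best, best_pos = None, None
        for c in candidates:
            p_world = camera_to_world(c.position_cam, rotation, translation)
            if self.is_inhibited(p_world):
                continue
            if best is None or c.saliency > best.saliency:
                best, best_pos = c, p_world
        if best is not None:
            self.attended.append(best_pos)
        return best


IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

ior = SpatialIOR()

# Frame 1: camera at the world origin.
frame1 = [Candidate("mug", 0.9, (0.0, 0.0, 1.0)),
          Candidate("book", 0.5, (0.5, 0.0, 1.0))]
first = ior.select(frame1, IDENTITY, (0.0, 0.0, 0.0))

# Frame 2: camera moved 0.2 m along x, so image-frame positions shift,
# but the world position of the mug is unchanged and stays inhibited.
frame2 = [Candidate("mug", 0.9, (-0.2, 0.0, 1.0)),
          Candidate("book", 0.5, (0.3, 0.0, 1.0))]
second = ior.select(frame2, IDENTITY, (0.2, 0.0, 0.0))
third = ior.select(frame2, IDENTITY, (0.2, 0.0, 0.0))
```

In this toy run, attention goes to the mug first, then to the book after the camera has moved, and finally to nothing once both locations are inhibited. A 2D image-space inhibition map would instead need to be warped or reset after every camera movement.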

Keywords: 3D inhibition of return; Computational visual attention; RGB-D object discovery.

MeSH terms

  • Algorithms
  • Attention / physiology*
  • Computer Simulation
  • Female
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Inhibition, Psychological*
  • Male
  • Pattern Recognition, Visual / physiology*
  • Photic Stimulation
  • Signal Detection, Psychological / physiology*
  • Time Factors
  • Video Recording / methods