Object class segmentation of RGB-D video using recurrent convolutional neural networks

Neural Netw. 2017 Apr:88:105-113. doi: 10.1016/j.neunet.2017.01.003. Epub 2017 Jan 30.

Abstract

Object class segmentation is a computer vision task which requires labeling each pixel of an image with the class of the object it belongs to. Deep convolutional neural networks (DNNs) are able to learn and take advantage of the local spatial correlations required for this task. They are, however, restricted by their small, fixed-size filters, which limit their ability to learn long-range dependencies. Recurrent neural networks (RNNs), on the other hand, do not suffer from this restriction. Their iterative interpretation allows them to model long-range dependencies by propagating activity. This property is especially useful when labeling video sequences, where both spatial and temporal long-range dependencies occur. In this work, a novel RNN architecture for object class segmentation is presented. We investigate several ways to train such a network. We evaluate our models on the challenging NYU Depth v2 dataset for object class segmentation and obtain competitive results.
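The contrast between a single small filter and iterative propagation of activity can be illustrated with a minimal sketch. This is not the architecture proposed in the paper: it assumes a single-channel map, a fixed 3x3 averaging kernel shared across iterations, and zero padding, and the names `conv3x3_same` and `influence_extent` are hypothetical. It only shows that applying the same small filter repeatedly lets one pixel influence an ever larger neighborhood.

```python
import numpy as np

def conv3x3_same(x, kernel):
    """Zero-padded 3x3 cross-correlation of a 2-D array (output keeps the input size)."""
    h, w = x.shape
    padded = np.pad(x, 1)  # one pixel of zeros on every side
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def influence_extent(steps, size=21):
    """Apply the same 3x3 filter `steps` times to a single active pixel and
    return the side length of the square region that has become non-zero."""
    state = np.zeros((size, size))
    state[size // 2, size // 2] = 1.0
    kernel = np.ones((3, 3)) / 9.0  # assumed kernel; a trained network would learn its weights
    for _ in range(steps):
        state = conv3x3_same(state, kernel)  # weights are shared across iterations
    rows = np.flatnonzero(state.sum(axis=1) > 0)
    return int(rows[-1] - rows[0] + 1)

extents = [influence_extent(t) for t in range(1, 6)]
print(extents)  # [3, 5, 7, 9, 11]
```

Under these assumptions, one application of the filter reaches a 3x3 neighborhood, while t iterations reach a (2t+1)x(2t+1) neighborhood without adding any parameters.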

Keywords: Computer vision; Object class-segmentation; Recurrent neural networks.

MeSH terms

  • Artificial Intelligence*
  • Humans
  • Neural Networks, Computer*
  • Pattern Recognition, Automated / methods*
  • Video Recording / methods*