Saliency prediction on stereoscopic videos

IEEE Trans Image Process. 2014 Apr;23(4):1476-90. doi: 10.1109/TIP.2014.2303640.

Abstract

We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames is weighted using models of human foveation and Panum's fusional area, yielding a single predictor of 3D saliency.
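The sketch below illustrates the pipeline the abstract describes: per-segment saliency computed from motion, disparity, texture, and a predicted-discomfort term, then weighted by foveation and Panum's-fusional-area models. All function names, weights, and parameter values here are illustrative assumptions, not the authors' published formulation.

```python
# Hypothetical sketch of the 3D saliency pipeline described in the abstract.
# Weights and parameters are assumed for illustration only.
import numpy as np


def foveation_weight(ecc_deg, half_res_ecc=2.3):
    """Approximate fall-off of retinal resolution with eccentricity (degrees)."""
    return half_res_ecc / (half_res_ecc + ecc_deg)


def panum_weight(disparity_deg, fusional_limit=1.0):
    """Down-weight disparities outside an assumed Panum's fusional area (degrees)."""
    return np.exp(-(disparity_deg / fusional_limit) ** 2)


def segment_saliency(motion, disparity_deg, texture, discomfort,
                     ecc_deg, weights=(0.4, 0.3, 0.2, 0.1)):
    """Combine per-segment attributes into a single saliency score.

    motion, texture : normalized feature energies in [0, 1]
    disparity_deg   : mean binocular disparity of the segment, in degrees
    discomfort      : predicted visual-discomfort score in [0, 1]
    ecc_deg         : eccentricity of the segment center from fixation, degrees
    """
    w_m, w_d, w_t, w_c = weights
    feature_energy = (w_m * motion
                      + w_d * abs(disparity_deg)
                      + w_t * texture
                      - w_c * discomfort)  # predicted discomfort suppresses saliency
    return max(feature_energy, 0.0) * foveation_weight(ecc_deg) * panum_weight(disparity_deg)


if __name__ == "__main__":
    # Two toy segments: a moving, comfortably fused foreground object vs.
    # a static background region with large disparity far from fixation.
    print(segment_saliency(motion=0.8, disparity_deg=0.3, texture=0.5,
                           discomfort=0.1, ecc_deg=1.0))
    print(segment_saliency(motion=0.1, disparity_deg=2.5, texture=0.4,
                           discomfort=0.6, ecc_deg=8.0))
```

In this toy example the foreground segment receives a much higher score because it is well fused, near fixation, and has strong motion, mirroring the abstract's claim that foveation and fusional limits modulate feature-driven saliency.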

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Depth Perception
  • Humans
  • Imaging, Three-Dimensional / methods*
  • Models, Theoretical
  • Motion
  • Pattern Recognition, Automated / methods*
  • Video Recording / methods*
  • Young Adult