Information-theoretic model comparison unifies saliency metrics

Proc Natl Acad Sci U S A. 2015 Dec 29;112(52):16054-9. doi: 10.1073/pnas.1510393112. Epub 2015 Dec 10.

Abstract

Learning the properties of an image associated with human gaze placement is important both for understanding how biological systems explore the environment and for computer vision applications. There is a large literature on quantitative eye movement models that seeks to predict fixations from images (sometimes termed "saliency" prediction). A major problem known to the field is that existing model comparison metrics give inconsistent results, causing confusion. We argue that the primary reason for these inconsistencies is that different metrics and models use different definitions of what a "saliency map" entails. For example, some metrics expect a model to account for image-independent central fixation bias whereas others will penalize a model that does. Here we bring saliency evaluation into the domain of information by framing fixation prediction models probabilistically and calculating information gain. We jointly optimize the scale, the center bias, and spatial blurring of all models within this framework. Evaluating existing metrics on these rephrased models produces almost perfect agreement in model rankings across the metrics. Model performance is separated from center bias and spatial blurring, avoiding the confounding of these factors in model comparison. We additionally provide a method to show where and how models fail to capture information in the fixations on the pixel level. These methods are readily extended to spatiotemporal models of fixation scanpaths, and we provide a software package to facilitate their use.
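The probabilistic framing described in the abstract can be illustrated with a minimal sketch. It assumes that a model is expressed as a probability distribution over image pixels and that information gain is the difference in average log-likelihood of the observed fixations, in bits per fixation, between the model and a baseline such as a uniform or center-bias density. The function names and the toy data are hypothetical and are not taken from the authors' software package.

```python
import numpy as np


def normalize_density(saliency_map, eps=1e-12):
    """Turn a nonnegative saliency map into a probability distribution over pixels.

    eps (an assumption of this sketch) avoids log(0) at pixels with zero saliency.
    """
    p = np.asarray(saliency_map, dtype=float) + eps
    return p / p.sum()


def mean_log_likelihood(density, fixations):
    """Average log2-probability the density assigns to the fixated pixels.

    fixations is a sequence of (row, col) pixel indices.
    """
    rows, cols = np.asarray(fixations).T
    return float(np.mean(np.log2(density[rows, cols])))


def information_gain(model_density, baseline_density, fixations):
    """Bits per fixation gained by the model over the baseline."""
    return (mean_log_likelihood(model_density, fixations)
            - mean_log_likelihood(baseline_density, fixations))


# Toy 2x2 image: a uniform baseline and a model that favors the top-left pixel.
baseline = normalize_density(np.ones((2, 2)))
model = normalize_density(np.array([[4.0, 2.0], [1.0, 1.0]]))
fixations = [(0, 0), (0, 0), (0, 1)]

gain = information_gain(model, baseline, fixations)
print(f"information gain: {gain:.3f} bits/fixation")
```

A positive value means the model assigns more probability to the fixated pixels than the baseline does; a model identical to the baseline yields zero. The sketch omits the joint optimization of scale, center bias, and spatial blurring that the abstract describes.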

Keywords: eye movements; likelihood; point processes; probabilistic modeling; visual attention.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Computer Simulation
  • Eye Movements / physiology*
  • Fixation, Ocular / physiology*
  • Humans
  • Models, Biological*
  • Models, Statistical
  • Pattern Recognition, Visual / physiology*
  • Photic Stimulation
  • Reproducibility of Results