Submodular Attribute Selection for Visual Recognition

IEEE Trans Pattern Anal Mach Intell. 2017 Nov;39(11):2242-2255. doi: 10.1109/TPAMI.2016.2636827. Epub 2016 Dec 7.

Abstract

In real-world visual recognition problems, low-level features cannot adequately characterize the semantic content in images, or the spatio-temporal structure in videos. In this work, we encode objects or actions based on attributes that describe them as high-level concepts. We consider two types of attributes. One type of attributes is generated by humans, while the second type is data-driven attributes extracted from data using dictionary learning methods. Attribute-based representation may exhibit variations due to noisy and redundant attributes. We propose a discriminative and compact attribute-based representation by selecting a subset of discriminative attributes from a large attribute set. Three attribute selection criteria are proposed and formulated as a submodular optimization problem. A greedy optimization algorithm is presented and its solution is guaranteed to be at least (1-1/e)-approximation to the optimum. Experimental results on four public datasets demonstrate that the proposed attribute-based representation significantly boosts the performance of visual recognition and outperforms most recently proposed recognition approaches.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Databases, Factual
  • Human Activities / classification
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Pattern Recognition, Automated / methods*
  • Sports / classification
  • Video Recording