Sample Selection for Training Cascade Detectors

PLoS One. 2015 Jul 21;10(7):e0133059. doi: 10.1371/journal.pone.0133059. eCollection 2015.

Abstract

Automatic detection systems usually require large and representative training datasets to achieve high detection rates and low false positive rates. In such datasets, the positive set often has few samples, the negative set must represent anything except the object of interest, or both. As a result, the negative set typically contains orders of magnitude more images than the positive set. However, imbalanced training databases lead to biased classifiers. In this paper, we focus on a negative sample selection method that properly balances the training data for cascade detectors. The method selects the most informative false positive samples generated in one stage and uses them to feed the next stage. The results show that the proposed cascade detector with sample selection obtains, on average, a better partial AUC and a smaller standard deviation than the other cascade detectors compared.
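The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how "most informative" is measured, so the code below assumes it means the negatives that the current stage scores highest, a common hard-negative heuristic and not necessarily the authors' criterion. The function name `select_false_positives`, the score values, and the threshold are all hypothetical.

```python
from typing import List, Sequence


def select_false_positives(scores: Sequence[float], threshold: float, k: int) -> List[int]:
    """Return indices of up to k negatives that the current stage wrongly accepts.

    Assumption (not from the paper): "most informative" is taken to mean the
    highest stage score, i.e. the negatives the stage most confidently
    mistakes for positives.
    """
    # A negative sample scoring at or above the threshold is a false positive.
    false_positives = [i for i, s in enumerate(scores) if s >= threshold]
    # Highest score first; ties are broken by original index for determinism.
    false_positives.sort(key=lambda i: (-scores[i], i))
    return false_positives[:k]


# Hypothetical stage scores for six negative samples.
negative_scores = [0.10, 0.80, 0.55, 0.95, 0.30, 0.60]
n_positives = 2  # balance: keep as many negatives as there are positives
selected = select_false_positives(negative_scores, threshold=0.5, k=n_positives)
print(selected)  # [3, 1]
```

Capping the selection at the size of the positive set is one simple way to keep each stage's training data balanced; the selected negatives would then be passed to the next stage as its negative training set.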

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Area Under Curve
  • Artificial Intelligence*
  • Breast Neoplasms / diagnosis
  • Computational Biology / methods*
  • Databases, Factual
  • Facial Recognition
  • False Positive Reactions
  • Female
  • Humans
  • Mammography / methods
  • Pattern Recognition, Automated / methods*
  • Pedestrians
  • ROC Curve
  • Radiographic Image Interpretation, Computer-Assisted

Grants and funding

This work was supported by Spanish Ministry for Economy and Competitiveness / European Regional Development Fund through project TIN2011-24367 and European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 643924. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.