Choosing the Most Effective Pattern Classification Model under Learning-Time Constraint

PLoS One. 2015 Jun 26;10(6):e0129947. doi: 10.1371/journal.pone.0129947. eCollection 2015.

Abstract

Nowadays, large datasets are common and demand faster and more effective pattern analysis techniques. However, methodologies to compare classifiers usually do not take into account the learning-time constraints required by applications. This work presents a methodology to compare classifiers with respect to their ability to learn from classification errors on a large learning set, within a given time limit. Faster techniques may acquire more training samples, but only when they are more effective will they achieve higher performance on unseen testing sets. We demonstrate this result using several techniques, multiple datasets, and typical learning-time limits required by applications.
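The idea of learning from classification errors under a time limit can be illustrated with a minimal sketch. This is not the authors' protocol: the nearest-centroid classifier, the toy data generator `make_blobs`, and the parameters `initial_size` and `batch` are all assumptions chosen for illustration. The loop starts from a small training set, classifies the remaining learning samples, moves some misclassified samples into the training set, retrains, and stops when the time budget runs out.

```python
import random
import time


def make_blobs(n, seed):
    """Hypothetical toy data: two 2-D Gaussian classes with alternating labels."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        center = 4.0 * label
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


def fit(samples):
    """Nearest-centroid model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(model, x) == y for x, y in samples) / len(samples)


def learn_from_errors(learning_set, time_limit_s, initial_size=10, batch=20):
    """Grow the training set with misclassified samples until time runs out.

    Assumes learning_set is already shuffled and that its first
    initial_size samples cover every class.
    """
    pool = list(learning_set)
    train, pool = pool[:initial_size], pool[initial_size:]
    model = fit(train)
    deadline = time.monotonic() + time_limit_s
    while pool and time.monotonic() < deadline:
        # Classify the remaining learning samples and collect the errors.
        wrong = [i for i, (x, y) in enumerate(pool) if predict(model, x) != y]
        if not wrong:
            break  # nothing left to learn from
        chosen = set(wrong[:batch])
        train.extend(pool[i] for i in sorted(chosen))
        pool = [s for i, s in enumerate(pool) if i not in chosen]
        model = fit(train)  # retrain on the enlarged training set
    return model, len(train)


if __name__ == "__main__":
    model, n_train = learn_from_errors(make_blobs(400, seed=1), time_limit_s=0.2)
    print("training samples used:", n_train)
    print("test accuracy:", accuracy(model, make_blobs(200, seed=2)))
```

To compare several classifiers in this spirit, one could run the same loop with each classifier's own `fit` and `predict` under an identical `time_limit_s`, then compare accuracy on a held-out test set. A faster classifier completes more iterations and sees more samples, but that helps only if each retraining step actually improves the model.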

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Humans
  • Learning*
  • Models, Theoretical*
  • Pattern Recognition, Automated*

Grants and funding

This work was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) (http://www.cnpq.br/), grant numbers 303182/2011-3 (JPP), 477692/2012-5 (PJR), 552559/2010-5 (AXF, PTMS), 481556/2009-5 (AXF), 303673/2010-9 (AXF), 470571/2013-6 (JPP), 306166/2014-3 (JPP), and 311140/2014-9 (PJR). This work was also supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Capes) (http://www.capes.gov.br/), grant number 01-P-01965/2012 (AXF); Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) (http://www.fapesp.br/), grant numbers 2011/14058-5 (JPP, RYMN), 2012/18768-0 (JPP), 2007/52015-0 (PJR), 2013/20387-7 (JPP), and 2014/16250-9 (JPP); and Fundação de Apoio ao Desenvolvimento do Ensino, Ciência e Tecnologia do Estado de Mato Grosso do Sul (Fundect-MS) (http://fundect.ledes.net/, WPA). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The authors also note that Big Data Brazil is the current employer of one of the coauthors (RN). However, the results in this paper come from work that author carried out during his master’s degree. This does not alter the authors’ adherence to PLOS ONE policies on sharing data and materials.