Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion

IEEE Trans Image Process. 2011 Jul;20(7):2049-62. doi: 10.1109/TIP.2011.2106794. Epub 2011 Jan 17.

Abstract

Facial age classification is an approach for classifying face images into one of several predefined age groups. One difficulty in applying learning techniques to the age classification problem is the large amount of labeled training data required. Acquiring such training data is very costly in terms of age progress, privacy, human time, and effort. Although unlabeled face images can be obtained easily, it would be expensive to label them manually on a large scale and obtain the ground truth. Frugally selecting unlabeled data for labeling, so as to quickly reach high classification performance with minimal labeling effort, is a challenging problem. In this paper, we present an active learning approach based on an online incremental bilateral two-dimension linear discriminant analysis (IB2DLDA), which initially learns from a small pool of labeled data and then iteratively selects the most informative samples from the unlabeled set to progressively improve the classifier. Specifically, we propose a novel data selection criterion called the furthest nearest-neighbor (FNN), which generalizes margin-based uncertainty to the multiclass case and is easy to compute, so that the proposed active learning algorithm can handle a large number of classes and large data sizes efficiently. Empirical experiments on the FG-NET and Morph databases, together with a large unlabeled data set for age categorization problems, show that the proposed approach can achieve results comparable to, or even better than, those of a conventionally trained active classifier that requires much more labeling effort. Our IB2DLDA-FNN algorithm can achieve similar results much faster than random selection and with fewer samples for age categorization. It can also achieve results comparable to those of active SVM while being much faster to train, because kernel methods are not needed.
The results on the face recognition database and the palmprint/palm vein database showed that our approach can handle problems with a large number of classes. Our contributions in this paper are twofold. First, we proposed the IB2DLDA-FNN, the FNN being our novel idea, as a generic online or active learning paradigm. Second, we showed that it can be another viable tool for active learning of facial age range classification.
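
The abstract names the FNN selection criterion but does not give its formula. The sketch below illustrates one plausible reading of the name, stated here as an assumption and not as the authors' exact definition: for each unlabeled sample, find its nearest labeled sample, then query the unlabeled samples whose nearest labeled neighbor is furthest away. The function name `fnn_select`, the use of Euclidean distance, and the use of raw feature vectors in place of IB2DLDA projections are all hypothetical choices made for illustration.

```python
import numpy as np


def fnn_select(unlabeled, labeled, k=1):
    """Pick k unlabeled samples under an assumed furthest nearest-neighbor rule.

    Assumption (not taken from the paper): a sample's score is the Euclidean
    distance to its nearest labeled sample, and the highest scores are queried.
    In the paper the features would come from the IB2DLDA subspace; plain
    feature vectors stand in for them here.
    """
    unlabeled = np.asarray(unlabeled, dtype=float)
    labeled = np.asarray(labeled, dtype=float)

    # Pairwise distances, shape (n_unlabeled, n_labeled), via broadcasting.
    diffs = unlabeled[:, None, :] - labeled[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)

    # Distance from each unlabeled sample to its nearest labeled sample.
    nearest = dists.min(axis=1)

    # Indices sorted by decreasing nearest-neighbor distance.
    order = np.argsort(-nearest, kind="stable")
    return order[:k]


# Toy data: two labeled points and three unlabeled candidates.
labeled_pool = [[0.0, 0.0], [1.0, 0.0]]
unlabeled_pool = [[0.1, 0.0], [5.0, 5.0], [1.0, 0.2]]

# Indices of the two candidates to send to a human annotator.
query = fnn_select(unlabeled_pool, labeled_pool, k=2)
```

In an active learning loop, the queried samples would be labeled, moved to the labeled pool, and used to update the classifier before the next round of selection. Because the score needs only distances and no kernel evaluation, it stays cheap as the number of classes grows, which is in line with the efficiency claim in the abstract.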

MeSH terms

  • Adolescent
  • Adult
  • Age Factors
  • Algorithms
  • Artificial Intelligence*
  • Biometric Identification / methods*
  • Child
  • Child, Preschool
  • Cluster Analysis
  • Databases, Factual
  • Discriminant Analysis
  • Face / anatomy & histology*
  • Humans
  • Infant
  • Middle Aged