Gentle Nearest Neighbors Boosting over Proper Scoring Rules

IEEE Trans Pattern Anal Mach Intell. 2015 Jan;37(1):80-93. doi: 10.1109/TPAMI.2014.2307877.

Abstract

Tailoring nearest neighbors algorithms to boosting is an important problem. Recent papers study an approach, UNN, which provably minimizes particular convex surrogates under weak assumptions. However, numerical issues make it necessary to experimentally tweak parts of the UNN algorithm, at the possible expense of the algorithm's convergence and performance. In this paper, we propose a lightweight Newton-Raphson alternative optimizing proper scoring rules from a very broad set, and establish formal convergence rates under the boosting framework that compete with those known for UNN. To the best of our knowledge, no such boosting-compliant convergence rates were previously known in the popular Gentle Adaboost's lineage. We provide experiments on a dozen domains, including Caltech and SUN computer vision databases, comparing our approach to major families including support vector machines, (Ada)boosting and stochastic gradient descent. They support three major conclusions: (i) GNNB significantly outperforms UNN, in terms of convergence rate and quality of the outputs, (ii) GNNB performs on par with or better than computationally intensive large margin approaches, (iii) on large domains that rule out those latter approaches for computational reasons, GNNB provides a simple and competitive contender to stochastic gradient descent. Experiments include a divide-and-conquer improvement of GNNB exploiting the link with proper scoring rules optimization.