On point estimation of the abnormality of a Mahalanobis index

Comput Stat Data Anal. 2016 Jul:99:115-130. doi: 10.1016/j.csda.2016.01.014.

Abstract

Mahalanobis distance may be used as a measure of the disparity between an individual's profile of scores and the average profile of a population of controls. The degree to which the individual's profile is unusual can then be equated to the proportion of the population who would have a larger Mahalanobis distance than the individual. Several estimators of this proportion are examined. These include plug-in maximum likelihood estimators, medians, the posterior mean from a Bayesian probability matching prior, an estimator derived from a Taylor expansion, and two forms of polynomial approximation, one based on Bernstein polynomials and one on a quadrature method. Simulations show that some estimators, including the commonly used plug-in maximum likelihood estimators, can have substantial bias for small or moderate sample sizes. The polynomial approximations yield estimators that have low bias, with the quadrature method marginally preferable to Bernstein polynomials. However, the polynomial estimators sometimes yield infeasible estimates that lie outside the 0-1 range. While none of the estimators is perfectly unbiased, the median estimators match their definition: in simulations, their estimates of the proportion have a median error close to zero. The standard median estimator can give unrealistically small estimates (including 0), and an adjustment is proposed that ensures estimates are always credible. This adjusted median estimator has much to recommend it when unbiasedness is not of paramount importance, while the quadrature method is recommended when bias is the dominant issue.
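As an illustration of the quantity being estimated, the following is a minimal sketch of a plug-in estimate, not the authors' implementation. It assumes the controls are multivariate normal, so that with known population parameters the squared Mahalanobis distance follows a chi-square distribution with k degrees of freedom (k being the number of scores). The sample mean and covariance are substituted for the unknown parameters. The function names `chi2_sf` and `plug_in_abnormality` and the `ml_covariance` switch are hypothetical and chosen only for this example.

```python
import math
import numpy as np


def chi2_sf(x, k):
    """Upper-tail probability P(X > x) for X ~ chi-square with k degrees of freedom.

    Computed as 1 - P(k/2, x/2), using the power series for the regularized
    lower incomplete gamma function P. Adequate for a sketch; a statistics
    library routine would be more accurate in the far tail.
    """
    if x <= 0:
        return 1.0
    a, z = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def plug_in_abnormality(controls, case, ml_covariance=True):
    """Return (squared Mahalanobis distance, plug-in proportion estimate).

    controls: (n, k) array of control profiles; case: length-k profile.
    ml_covariance=True divides by n (maximum likelihood under normality);
    False divides by n - 1 (the usual unbiased sample covariance).
    """
    controls = np.asarray(controls, dtype=float)
    case = np.asarray(case, dtype=float)
    n, k = controls.shape
    mean = controls.mean(axis=0)
    centered = controls - mean
    divisor = n if ml_covariance else n - 1
    cov = centered.T @ centered / divisor
    diff = case - mean
    # Solve cov @ y = diff rather than forming the inverse explicitly.
    d2 = float(diff @ np.linalg.solve(cov, diff))
    # Assumption: chi-square reference distribution with estimates plugged in.
    return d2, chi2_sf(d2, k)


# Usage with simulated controls (illustrative data only).
rng = np.random.default_rng(0)
controls = rng.standard_normal((20, 3))
case = np.array([2.0, -1.5, 1.0])
d2, p_hat = plug_in_abnormality(controls, case)
print(f"squared distance = {d2:.3f}, estimated proportion more extreme = {p_hat:.3f}")
```

Because the plug-in step treats the estimated mean and covariance as if they were the true parameters, it ignores their sampling variability. This is one way to see why such an estimator can be biased when the control sample is small relative to the number of scores.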

Keywords: Bernstein polynomials; Mahalanobis distance; Median estimator; Plug-in maximum likelihood; Quadrature approximation; Unbiased estimation.