Assessing Conformance with Benford's Law: Goodness-Of-Fit Tests and Simultaneous Confidence Intervals

PLoS One. 2016 Mar 28;11(3):e0151235. doi: 10.1371/journal.pone.0151235. eCollection 2016.

Abstract

Benford's Law is a probability distribution for the first significant digits of numbers; for example, the first significant digits of the numbers 871 and 0.22 are 8 and 2, respectively. The law is particularly remarkable because many types of data are considered to be consistent with it, and scientists and investigators have applied it in diverse areas, for example, diagnostic tests for mathematical models in biology, genomics, neuroscience, image analysis and fraud detection. In this article we present and compare statistically sound methods for assessing conformance of data with Benford's Law, including discrete versions of Cramér-von Mises (CvM) statistical tests and simultaneous confidence intervals. We demonstrate that the common use of multiple individual binomial confidence intervals leads to rejection of Benford too often for truly Benford data. Based on our investigation, we recommend that the CvM statistic $U_d^2$, Pearson's chi-square statistic and 100(1 - α)% Goodman's simultaneous confidence intervals be computed when assessing conformance with Benford's Law. Visual inspection of the data with simultaneous confidence intervals is useful for understanding departures from Benford and the influence of sample size.
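
Two of the quantities recommended in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the standard Benford probabilities log10(1 + 1/d) for d = 1, ..., 9 and the usual form of Goodman's (1965) simultaneous intervals for multinomial proportions, with the chi-square quantile on 1 degree of freedom obtained as a squared normal quantile. The function names (`first_digit`, `digit_counts`, `pearson_chi_square`, `goodman_intervals`) are hypothetical, and the discrete CvM statistic is not implemented here.

```python
import math
from statistics import NormalDist

# Benford's Law: P(first significant digit = d) = log10(1 + 1/d), d = 1..9
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Return the first significant digit of a nonzero number (871 -> 8, 0.22 -> 2)."""
    if x == 0:
        raise ValueError("zero has no first significant digit")
    # Scientific notation puts the leading digit first; 16 significant
    # digits avoids artefacts such as 0.3 printing as 2.999...
    return int(format(abs(x), ".15e")[0])


def digit_counts(values):
    """Count how often each first digit 1..9 occurs in the data."""
    counts = [0] * 9
    for v in values:
        counts[first_digit(v) - 1] += 1
    return counts


def pearson_chi_square(counts):
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    n = sum(counts)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(counts, BENFORD))


def goodman_intervals(counts, alpha=0.05):
    """Goodman-type simultaneous 100(1 - alpha)% intervals for the 9 proportions.

    Assumed form: a = upper alpha/k quantile of chi-square with 1 df (k = 9),
    limits = (a + 2c -/+ sqrt(a * (a + 4c(n - c)/n))) / (2(n + a)).
    """
    n = sum(counts)
    k = len(counts)
    # A chi-square(1) quantile equals the square of a normal quantile.
    a = NormalDist().inv_cdf(1 - alpha / (2 * k)) ** 2
    intervals = []
    for c in counts:
        half = math.sqrt(a * (a + 4 * c * (n - c) / n))
        denom = 2 * (n + a)
        intervals.append(((a + 2 * c - half) / denom, (a + 2 * c + half) / denom))
    return intervals


if __name__ == "__main__":
    # Illustrative data only: first digits of powers of 2 are close to Benford.
    counts = digit_counts(2 ** i for i in range(1, 1001))
    print("chi-square:", round(pearson_chi_square(counts), 3))
    for d, (p, (lo, hi)) in enumerate(zip(BENFORD, goodman_intervals(counts)), 1):
        print(d, round(lo, 4), round(hi, 4), "Benford inside:", lo <= p <= hi)
```

Under this sketch, the chi-square statistic would be compared with a chi-square distribution on 8 degrees of freedom (5% critical value about 15.507), and a Benford probability falling outside its interval flags the digit responsible for the departure. The Bonferroni-style adjustment α/k is what distinguishes these simultaneous intervals from nine separate binomial intervals each at level α.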

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Confidence Intervals
  • Databases, Genetic
  • Genomics
  • Models, Theoretical*
  • Open Reading Frames / genetics
  • Probability
  • Sample Size

Grants and funding

C. Tsao was funded by a Natural Sciences and Engineering Research Council of Canada USRA grant, and M. Lesperance was funded by a Natural Sciences and Engineering Research Council of Canada Discovery grant.