Robustifying Bayesian nonparametric mixtures for count data

Biometrics. 2017 Mar;73(1):174-184. doi: 10.1111/biom.12538. Epub 2016 Apr 28.

Abstract

Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to the popular Dirichlet process mixture of Poisson kernels by increasing model flexibility at the level of both the kernel and the nonparametric mixing measure. This allows us to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study covering different scenarios also yield some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel, with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data, one has to enrich the model simultaneously at both levels, the kernel and the mixing measure.

Keywords: Abundance heterogeneity; Bayesian Nonparametrics; Mixture model; Pitman-Yor process; Poisson mixture; Rounded Mixture of Gaussians.
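
The two levels of flexibility named in the abstract and keywords can be illustrated with a minimal generative sketch. This is not the authors' implementation and it does no posterior inference. It assumes a truncated stick-breaking representation of the Pitman-Yor process (a discount of 0 gives the Dirichlet process) and a rounded Gaussian kernel that maps a latent normal draw to a count via max(0, ceil(x)). The function names, the truncation level, and the rounding thresholds are all illustrative assumptions.

```python
import numpy as np


def pitman_yor_weights(discount, strength, n_atoms, rng):
    """Truncated stick-breaking weights for a Pitman-Yor process.

    Assumes 0 <= discount < 1 and strength > -discount.
    discount = 0 recovers the Dirichlet process with concentration `strength`.
    """
    k = np.arange(1, n_atoms + 1)
    # V_k ~ Beta(1 - discount, strength + k * discount)
    v = rng.beta(1.0 - discount, strength + k * discount)
    v[-1] = 1.0  # close the truncation so the weights sum to one
    # Stick remaining before each break: prod_{j<k} (1 - V_j)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def sample_poisson_mixture(weights, rates, size, rng):
    """Counts from a mixture of Poisson kernels (one rate per atom)."""
    z = rng.choice(len(weights), size=size, p=weights)
    return rng.poisson(rates[z])


def sample_rounded_gaussian_mixture(weights, means, sds, size, rng):
    """Counts from a mixture of rounded Gaussian kernels.

    Assumed rounding: latent x <= 0 maps to 0, and j - 1 < x <= j maps to j.
    """
    z = rng.choice(len(weights), size=size, p=weights)
    latent = rng.normal(means[z], sds[z])
    return np.maximum(0, np.ceil(latent)).astype(int)


# Illustrative usage with arbitrary hyperparameters (assumptions, not from the paper).
rng = np.random.default_rng(0)
w = pitman_yor_weights(discount=0.25, strength=1.0, n_atoms=50, rng=rng)
rates = rng.gamma(shape=2.0, scale=5.0, size=w.size)
sds = rng.uniform(0.5, 5.0, size=w.size)
poisson_counts = sample_poisson_mixture(w, rates, size=1000, rng=rng)
rounded_counts = sample_rounded_gaussian_mixture(w, rates, sds, size=1000, rng=rng)
```

In this sketch each Poisson component has its variance tied to its mean, whereas each rounded Gaussian component carries a separate scale parameter. That is one way to read the abstract's point about the rigidity of the Poisson kernel.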

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bayes Theorem*
  • Cluster Analysis*
  • Demography
  • Models, Statistical
  • Poisson Distribution
  • Statistics, Nonparametric