Inferring species richness using multispecies occupancy modeling: Estimation performance and interpretation

Ecol Evol. 2019 Feb 5;9(2):780-792. doi: 10.1002/ece3.4821. eCollection 2019 Jan.

Abstract

Multispecies occupancy models can estimate species richness from spatially replicated multispecies detection/non-detection survey data, while accounting for imperfect detection. A model extension using data augmentation allows inferring the total number of species in the community, including those completely missed by sampling (i.e., not detected in any survey, at any site). Here we investigate the robustness of these estimates. We review key model assumptions and test performance via simulations, under a range of scenarios of species characteristics and sampling regimes, exploring sensitivity to the Bayesian priors used for model fitting. We run tests when assumptions are perfectly met and when violated. We apply the model to a real dataset and contrast estimates obtained with and without predictors, and for different subsets of data. We find that, even with model assumptions perfectly met, estimation of the total number of species can be poor in scenarios where many species are missed (>15%-20%) and that commonly used priors can accentuate overestimation. Our tests show that estimation can often be robust to violations of assumptions about the statistical distributions describing variation of occupancy and detectability among species, but lower-tail deviations can result in large biases. We obtain substantially different estimates from alternative analyses of our real dataset, with results suggesting that missing relevant predictors in the model can result in richness underestimation. In summary, estimates of total richness are sensitive to model structure and often uncertain. Appropriate selection of priors, testing of assumptions, and model refinement are all important to enhance estimator performance. Yet, these do not guarantee accurate estimation, particularly when many species remain undetected. While statistical models can provide useful insights, expectations about accuracy in this challenging prediction task should be realistic. Where knowledge about species numbers is considered truly critical for management or policy, survey effort should ideally be such that the chances of missing species altogether are low.
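
To make the idea of species "completely missed by sampling" concrete, the following is a minimal simulation sketch in Python. It is not the authors' code and does not fit an occupancy model; it only generates detection/non-detection outcomes and counts how many species are detected at least once. It assumes, for illustration, that species-level occupancy and detection probabilities are normally distributed on the logit scale. All function names, parameter names, and values below are hypothetical.

```python
import math
import random


def inv_logit(x):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_detected_species(n_species, n_sites, n_surveys,
                              mu_psi, sd_psi, mu_p, sd_p, rng):
    """Simulate one survey campaign; return the number of species
    detected at least once (in any survey, at any site).

    Assumption (illustrative only): each species has its own occupancy
    probability psi and detection probability p, drawn from normal
    distributions on the logit scale.
    """
    detected = 0
    for _ in range(n_species):
        psi = inv_logit(rng.gauss(mu_psi, sd_psi))  # species occupancy
        p = inv_logit(rng.gauss(mu_p, sd_p))        # species detectability
        seen = False
        for _ in range(n_sites):
            if rng.random() < psi:                  # site is occupied
                # Detection is only possible at occupied sites.
                if any(rng.random() < p for _ in range(n_surveys)):
                    seen = True
                    break                           # one detection suffices
        if seen:
            detected += 1
    return detected


def mean_fraction_missed(n_reps, n_species, **kwargs):
    """Average, over replicate simulations, of the fraction of species
    never detected."""
    rng = random.Random(kwargs.pop("seed", 1))
    total = 0.0
    for _ in range(n_reps):
        found = simulate_detected_species(n_species, rng=rng, **kwargs)
        total += (n_species - found) / n_species
    return total / n_reps


if __name__ == "__main__":
    # Hypothetical scenario: 50 species, 30 sites, 3 surveys per site.
    frac = mean_fraction_missed(
        n_reps=200, n_species=50, n_sites=30, n_surveys=3,
        mu_psi=-1.5, sd_psi=1.5, mu_p=-2.0, sd_p=1.5,
    )
    print(f"Mean fraction of species missed: {frac:.2f}")
```

Varying the number of sites, the number of surveys, or the logit-scale means and standard deviations gives a rough sense of which sampling regimes leave a large share of species undetected, which is the situation the abstract identifies as problematic for estimating total richness.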

Keywords: Switzerland; data augmentation; detectability; imperfect detection; richness; species occupancy.