A need for speed in Bayesian population models: a practical guide to marginalizing and recovering discrete latent states

Charles B Yackulic; Michael Dodrill; Maria Dzul; Jamie S Sanderlin; Janice A Reid

doi:10.1002/eap.2112

Ecol Appl. 2020 Jul;30(5):e02112. doi: 10.1002/eap.2112. Epub 2020 Apr 1.

Authors

Charles B Yackulic¹, Michael Dodrill¹, Maria Dzul¹, Jamie S Sanderlin², Janice A Reid³

Affiliations

¹ Southwest Biological Science Center, U.S. Geological Survey, 2255 North Gemini Drive, Flagstaff, Arizona, 86001, USA.
² USDA Forest Service, Rocky Mountain Research Station, Flagstaff, Arizona, 86001, USA.
³ USDA Forest Service, Pacific Northwest Research Station, Roseburg Field Station, Roseburg, Oregon, 97331, USA.

PMID: 32112492

Abstract

Bayesian population models can be exceedingly slow due, in part, to the choice to simulate discrete latent states. Here, we discuss an alternative approach to discrete latent states, marginalization, that forms the basis of maximum likelihood population models and is much faster. Our manuscript has two goals: (1) to introduce readers unfamiliar with marginalization to the concept and provide worked examples and (2) to address topics associated with marginalization that have not been previously synthesized and are relevant to both Bayesian and maximum likelihood models. We begin by explaining marginalization using a Cormack-Jolly-Seber model. Next, we apply marginalization to multistate capture-recapture, community occupancy, and integrated population models and briefly discuss random effects, priors, and pseudo-R². Then, we focus on recovery of discrete latent states, defining different types of conditional probabilities and showing how quantities such as population abundance or species richness can be estimated in marginalized code. Last, we show that occupancy and site-abundance models with auto-covariates can be fit with marginalized code with minimal impact on parameter estimates. Marginalized code was anywhere from five to >1,000 times faster than discrete code and differences in inferences were minimal. Discrete latent states and fully conditional approaches provide the best estimates of conditional probabilities for a given site or individual. However, estimates for parameters and derived quantities such as species richness and abundance are minimally affected by marginalization. In the case of abundance, marginalized code is both quicker and has lower bias than an N-augmentation approach. Understanding how marginalization works shrinks the divide between Bayesian and maximum likelihood approaches to population models. Some models that have only been presented in a Bayesian framework can easily be fit in maximum likelihood. On the other hand, factors such as informative priors, random effects, or pseudo-R² values may motivate a Bayesian approach in some applications. An understanding of marginalization allows users to minimize the speed that is sacrificed when switching from a maximum likelihood approach. Widespread application of marginalization in Bayesian population models will facilitate more thorough simulation studies, comparisons of alternative model structures, and faster learning.
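As a rough illustration of what marginalizing a discrete latent state means in a Cormack-Jolly-Seber setting, the sketch below sums over the unobserved alive/dead state of one individual with a forward (hidden Markov model) recursion, rather than simulating that state. It is not the authors' code. The function name cjs_marginal_likelihood, the constant survival probability phi, and the constant detection probability p are assumptions made for this example only.

```python
def cjs_marginal_likelihood(history, phi, p):
    """Probability of a 0/1 capture history, conditional on first capture,
    with the latent alive/dead state summed out (illustrative sketch).

    phi: assumed constant survival probability between occasions.
    p:   assumed constant detection probability for a live animal.
    """
    first = history.index(1)   # the model conditions on the first capture
    alive, dead = 1.0, 0.0     # the animal is known to be alive when first caught

    for y in history[first + 1:]:
        # Survival step: a live animal survives with probability phi;
        # a dead animal stays dead.
        alive, dead = alive * phi, alive * (1.0 - phi) + dead
        # Observation step: only live animals can be detected.
        if y == 1:
            alive, dead = alive * p, 0.0
        else:
            alive, dead = alive * (1.0 - p), dead

    # Summing over both states at the end gives the marginal likelihood.
    return alive + dead


# Example call with made-up parameter values.
likelihood = cjs_marginal_likelihood([1, 0, 1], phi=0.8, p=0.5)
```

For the history 1, 0, 1 the animal must have been alive throughout, so the result equals phi * (1 - p) * phi * p. For a history ending in zeros, the recursion also accumulates the probability that the animal died at some point, and this summation is what replaces the simulated latent state.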

Keywords: N-occupancy; augmentation; autologistic; closed conditional; density dependence; forward conditional; fully conditional; hidden Markov model; mark-recapture; unconditional.

Published 2020. This article is a U.S. Government work and is in the public domain in the USA.

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Bayes Theorem
Computer Simulation
Likelihood Functions
Models, Statistical*
Population Density
Population Dynamics
