Comparative assessment of parameter estimation methods in the presence of overdispersion: a simulation study

Kimberlyn Roosa; Ruiyan Luo; Gerardo Chowell

doi:10.3934/mbe.2019214

Math Biosci Eng. 2019 May 16;16(5):4299-4313. doi: 10.3934/mbe.2019214.

Authors

Kimberlyn Roosa¹, Ruiyan Luo¹, Gerardo Chowell¹ ²

Affiliations

¹ Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA.
² Division of International Epidemiology and Population Studies, Fogarty International Center, National Institutes of Health, Bethesda, MD, USA.

PMID: 31499663
DOI: 10.3934/mbe.2019214

Abstract

The Poisson distribution is commonly assumed as the error structure for count data; however, empirical data may exhibit greater variability than expected under a given statistical model. Greater variability could point to model misspecification, such as missing crucial information about the epidemiology of the disease or changes in population behavior. When the mechanism producing the apparent overdispersion is unknown, it is typically assumed that the variance in the data exceeds the mean (by some scaling factor). Thus, a probability distribution that allows for overdispersion (negative binomial, for example) may better represent the data. Here, we use simulation studies to assess how misspecifying the error structure affects parameter estimation results, specifically bias and uncertainty, as a function of the level of random noise in the data. We compare results for two parameter estimation methods: nonlinear least squares and maximum likelihood estimation with Poisson error structure. We analyze two phenomenological models, the generalized growth model and the generalized logistic growth model, to assess how parameter estimates are affected by the level of overdispersion in the data. We use simulation to obtain confidence intervals and mean squared error of parameter estimates. We also analyze the impact of the amount of data, or ascending phase length, on the results of the generalized growth model for increasing levels of overdispersion. The results show a clear pattern of increasing uncertainty, or confidence interval width, as the overdispersion in the data increases. While maximum likelihood estimation consistently yielded narrower confidence intervals and smaller mean squared error, differences between the two methods were minimal and not practically significant. At moderate levels of overdispersion, both estimation methods performed similarly. Importantly, issues of parameter uncertainty and bias in the presence of overdispersion can be mitigated with the inclusion of more data.
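The kind of simulation study described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several choices are assumptions made only for illustration: the generalized growth model is written as dC/dt = r·C^p and evaluated through its closed-form solution for 0 < p < 1; the "true" values r = 0.4 and p = 0.8, the 30-day ascending phase, and the dispersion levels are hypothetical; overdispersed counts are drawn from a negative binomial with variance equal to a scaling factor times the mean; and a coarse grid search stands in for a proper numerical optimizer so that both objectives (sum of squared errors and Poisson negative log-likelihood) are minimized over the same candidates.

```python
import numpy as np

TRUE_R, TRUE_P = 0.4, 0.8  # hypothetical "true" parameters, chosen for illustration


def ggm_incidence(t, r, p, c0=1.0):
    """Incidence C'(t) = r*C(t)**p from the closed-form GGM solution (0 < p < 1)."""
    c = (c0 ** (1.0 - p) + r * (1.0 - p) * t) ** (1.0 / (1.0 - p))
    return r * c ** p


def simulate_counts(mean, dispersion, rng):
    """Draw counts with variance = dispersion * mean (dispersion == 1 is Poisson)."""
    if dispersion == 1.0:
        return rng.poisson(mean)
    prob = 1.0 / dispersion              # NumPy's success probability
    size = mean * prob / (1.0 - prob)    # equals mean / (dispersion - 1)
    return rng.negative_binomial(size, prob)


def make_grid(t, r_grid, p_grid):
    """Model means for every (r, p) candidate; shape (len(r), len(p), len(t))."""
    R, P = np.meshgrid(r_grid, p_grid, indexing="ij")
    mu = ggm_incidence(t, R[..., None], P[..., None])
    return R, P, mu


def fit_both(y, R, P, mu, log_mu):
    """Return ((r, p) by least squares, (r, p) by Poisson maximum likelihood)."""
    sse = ((y - mu) ** 2).sum(axis=-1)
    nll = (mu - y * log_mu).sum(axis=-1)  # Poisson NLL up to a constant in y
    i_ls = np.unravel_index(np.argmin(sse), sse.shape)
    i_ml = np.unravel_index(np.argmin(nll), nll.shape)
    return (R[i_ls], P[i_ls]), (R[i_ml], P[i_ml])


def run_study(dispersions=(1.0, 5.0, 20.0), n_sims=100, n_days=30, seed=0):
    """Simulate, refit, and summarize 95% interval width and MSE for (r, p)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_days + 1, dtype=float)
    R, P, mu = make_grid(t, np.linspace(0.05, 1.0, 96), np.linspace(0.4, 0.99, 60))
    log_mu = np.log(mu)
    true_mean = ggm_incidence(t, TRUE_R, TRUE_P)
    truth = np.array([TRUE_R, TRUE_P])
    out = {}
    for d in dispersions:
        est = {"nls": [], "mle": []}
        for _ in range(n_sims):
            y = simulate_counts(true_mean, d, rng)
            ls, ml = fit_both(y, R, P, mu, log_mu)
            est["nls"].append(ls)
            est["mle"].append(ml)
        out[d] = {}
        for name, vals in est.items():
            vals = np.array(vals)  # shape (n_sims, 2): columns are r and p
            lo, hi = np.percentile(vals, [2.5, 97.5], axis=0)
            out[d][name] = {
                "ci_width": hi - lo,
                "mse": ((vals - truth) ** 2).mean(axis=0),
            }
    return out
```

Calling `run_study()` returns, for each dispersion level and each method, the width of the 2.5th to 97.5th percentile interval and the mean squared error of the estimates of r and p. Increasing `n_days` in this sketch mimics a longer ascending phase, which is the lever the abstract identifies for mitigating uncertainty and bias.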

Keywords: epidemiological models; generalized growth model; overdispersion; parameter estimation; parameter uncertainty; phenomenological models.

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Communicable Diseases / epidemiology
Computer Simulation
Epidemics / statistics & numerical data*
Humans
Infection Control / statistics & numerical data
Least-Squares Analysis
Likelihood Functions
Logistic Models
Mathematical Concepts
Models, Biological*
Models, Statistical*
Monte Carlo Method
Nonlinear Dynamics
Poisson Distribution