Estimation of the time of exposure based on interval and censored data using the ε-accelerated EM algorithm

Stat Med. 2023 Nov 10;42(25):4542-4555. doi: 10.1002/sim.9874. Epub 2023 Aug 22.

Abstract

Accurately estimating the timing of pathogen exposure plays a crucial role in outbreak control for emerging infectious diseases, including source identification, contact tracing, and vaccine research and development. However, because surveillance activities often collect data retrospectively, after symptoms have appeared, accurate data on the timing of disease onset are difficult to obtain in practice, and the observations can be "coarse", such as interval or censored data. To address this challenge, we propose a novel likelihood function tailored to coarsely observed data in rapid outbreak surveillance, along with an optimization method based on an ε-accelerated EM algorithm that converges faster to the maximum likelihood estimates (MLEs). The covariance matrix of the MLEs is also discussed using a nonparametric bootstrap approach. The performance of the proposed method is evaluated in terms of bias and mean-squared error through extensive numerical experiments, as well as through its application to a series of epidemiological surveillance studies of mass food poisoning cases. The experiments show that our method exhibits less bias than conventional methods, providing greater efficiency across all scenarios.

Keywords: ε-acceleration; EM algorithm; exposure time; infectious disease; three-parameter lognormal.
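
The abstract does not spell out the acceleration step, so the following is only a minimal sketch of how ε-acceleration of an EM sequence is commonly written, assuming the vector ε form in which three successive EM iterates are combined through the Samelson inverse x / ||x||^2. The function names (samelson_inverse, epsilon_accelerate, em_step) and the toy model, an exponential mean estimated from right-censored data, are illustrative assumptions; they are not the paper's likelihood or its three-parameter lognormal model.

```python
def samelson_inverse(v):
    """Samelson inverse of a vector: v / ||v||^2."""
    norm_sq = sum(x * x for x in v)
    return [x / norm_sq for x in v]


def epsilon_accelerate(prev, curr, nxt):
    """Combine three successive iterates into one accelerated estimate.

    psi = curr + [ [prev - curr]^-1 + [nxt - curr]^-1 ]^-1,
    where [.]^-1 is the Samelson inverse.
    """
    d_back = [p - c for p, c in zip(prev, curr)]
    d_fwd = [n - c for n, c in zip(nxt, curr)]
    # If the sequence has already stopped moving, there is nothing to accelerate.
    if not any(d_back) or not any(d_fwd):
        return list(nxt)
    inv_sum = [a + b for a, b in
               zip(samelson_inverse(d_back), samelson_inverse(d_fwd))]
    if not any(inv_sum):
        return list(nxt)
    correction = samelson_inverse(inv_sum)
    return [c + k for c, k in zip(curr, correction)]


def em_step(theta, observed, censored):
    """One EM update for the mean of an exponential distribution (toy model).

    E-step: a value right-censored at c has expected value c + mean.
    M-step: the new mean is the average of the completed data.
    """
    mean = theta[0]
    completed_total = sum(observed) + sum(c + mean for c in censored)
    return [completed_total / (len(observed) + len(censored))]


# Hypothetical data: three exact times and two right-censored times.
observed = [1.0, 2.0, 3.0]
censored = [4.0, 5.0]

theta0 = [1.0]
theta1 = em_step(theta0, observed, censored)
theta2 = em_step(theta1, observed, censored)
psi = epsilon_accelerate(theta0, theta1, theta2)

# Closed-form MLE for this toy model: total time / number of exact events.
mle = (sum(observed) + sum(censored)) / len(observed)
print("plain EM after two steps:", theta2[0])
print("accelerated estimate:    ", psi[0])
print("closed-form MLE:         ", mle)
```

In this toy case the EM map is exactly linear, so a single accelerated step lands on the MLE while plain EM is still approaching it. With a nonlinear EM map, such as one arising from a lognormal model, the accelerated sequence would typically be generated alongside the EM iterates and checked for convergence rather than used once.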