Bias-corrected maximum-likelihood estimation of multiplicity of infection and lineage frequencies

PLoS One. 2021 Dec 29;16(12):e0261889. doi: 10.1371/journal.pone.0261889. eCollection 2021.

Abstract

Background: The UN's Sustainable Development Goals aim to eradicate a range of infectious diseases to achieve global well-being. These efforts require monitoring disease transmission at a level that differentiates between pathogen variants at the genetic/molecular level. In fact, the advantages of genetic (molecular) measures such as multiplicity of infection (MOI) over traditional metrics, e.g., R0, are increasingly recognized. MOI refers to the presence of multiple pathogen variants within an infection due to multiple infective contacts. Maximum-likelihood (ML) methods have been proposed to derive MOI and pathogen-lineage frequencies from molecular data. However, these methods are biased.

Methods and findings: Based on a single molecular marker, we derive a bias-corrected ML estimator for MOI and pathogen-lineage frequencies. We further improve these estimators by heuristic adjustments that compensate for shortcomings in the derivation of the bias correction, which implicitly assumes that the data lie in the interior of the observational space. The finite-sample properties of the different variants of the bias-corrected estimators are investigated in a systematic simulation study. In particular, we investigate the performance of the estimators in terms of bias, variance, and robustness against model violations. The corrections successfully remove bias except for extreme parameters that likely yield uninformative data, which cannot sustain accurate parameter estimation. Heuristic adjustments further improve the bias correction, particularly for small sample sizes. The bias corrections also reduce the estimators' variances, which coincide with the Cramér-Rao lower bound. The estimators are reasonably robust against model violations.
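To make the estimation problem concrete, the following is a minimal sketch of a plain (uncorrected) ML estimator for a single marker. It is not the bias-corrected estimator of this paper, whose formula is not given in the abstract. The sketch assumes a model the abstract does not state: MOI follows a conditional (zero-truncated) Poisson distribution with parameter lambda, and only the presence or absence of each lineage in a sample is observed. Under that assumption the log-likelihood is -N log(e^lambda - 1) + sum_k N_k log(e^(lambda p_k) - 1), where N is the sample size and N_k is the number of samples containing lineage k. The function name `mle_moi` and its arguments are hypothetical.

```python
import math


def mle_moi(counts, n_samples):
    """Plain ML estimates under an ASSUMED conditional-Poisson MOI model.

    counts[k]  -- number of samples in which lineage k was detected (N_k)
    n_samples  -- total number of samples (N)
    Returns (lam, psi, freqs): Poisson parameter, mean MOI, lineage frequencies.
    """
    q = [c / n_samples for c in counts]
    # No finite ML estimate if no sample shows more than one lineage
    # (sum of N_k equals N) or if some lineage occurs in every sample.
    if sum(q) <= 1.0 or any(x >= 1.0 for x in q):
        raise ValueError("ML estimate does not exist for these data")

    def f(lam):
        # Score equation in lam after eliminating the frequencies:
        # lam + sum_k log(1 - q_k * (1 - exp(-lam))) = 0
        a = -math.expm1(-lam)  # 1 - exp(-lam), computed accurately
        return lam + sum(math.log1p(-x * a) for x in q)

    # f is negative near zero and grows without bound, so bracket the root.
    lo, hi = 1e-8, 1.0
    while f(hi) <= 0.0:
        hi *= 2.0
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)

    a = -math.expm1(-lam)
    freqs = [-math.log1p(-x * a) / lam for x in q]  # these sum to 1 at the root
    psi = lam / a  # mean of the zero-truncated Poisson distribution
    return lam, psi, freqs
```

For example, `mle_moi([60, 45, 30], 100)` would return estimates for 100 samples in which three lineages were detected 60, 45, and 30 times. Repeating such a fit over many simulated datasets and averaging the estimates is one way to see the finite-sample bias that the corrections described above are meant to remove.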

Conclusions: Applying bias corrections can substantially improve the quality of MOI estimates, particularly in areas of low transmission and in areas of high transmission; in both cases, uncorrected estimates tend to be biased. The bias-corrected estimators are (almost) unbiased and their variance coincides with the Cramér-Rao lower bound, suggesting that no further improvements are possible unless additional information is provided. Additional information can be obtained by combining data from several molecular markers, or by including information that allows stratifying the data into heterogeneous groups.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias
  • Computer Simulation*
  • Data Interpretation, Statistical*
  • Humans
  • Infections / epidemiology*
  • Likelihood Functions
  • Models, Statistical*

Grants and funding

K.A.S. is funded by the German Academic Exchange Service (DAAD; Project-ID 57417782), the SMWK-SAB project “Innovationsvorhaben zur Profilschärfung an Hochschulen für angewandte Wissenschaften” (Project number 100257255), the Federal Ministry of Education and Research (BMBF) and the DLR (Project number 01DQ20002), the ESF Young Investigator Group “Agile Publika” funded by ESF, SMWK, SAB (SAB Project 100310497), the DFG Projektakademie “Ökologisch nachhaltige Wertschöpfungsketten in der Landwirtschaft zur Optimierung des Insektizid-Gebrauchs aufgrund von automatisiertem Schädlings-Monitoring” (DFG project 656983), and a PhD scholarship from the Hanns Seidel Stiftung. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.