Methods for Handling Left-Censored Data in Quantitative Microbial Risk Assessment

Appl Environ Microbiol. 2018 Oct 1;84(20):e01203-18. doi: 10.1128/AEM.01203-18. Print 2018 Oct 15.

Abstract

Data below detection limits (left-censored data) are common in environmental microbiology, and decisions about how to handle censored data may have implications for quantitative microbial risk assessment (QMRA). In this paper, we use simulated data sets informed by real-world enterovirus water data to evaluate methods for handling left-censored data. Data sets were simulated with four censoring degrees (low [10%], medium [35%], high [65%], and severe [90%]) and one real-life censoring example (97%), and were informed by enterovirus data assuming a lognormal distribution with a limit of detection (LOD) of 2.3 genome copies/liter. For each data set, five methods for handling left-censored data were applied: (i) substitution with LOD/[Formula: see text], (ii) lognormal maximum likelihood estimation (MLE) to estimate the mean and standard deviation, (iii) Kaplan-Meier (KM) estimation, (iv) imputation using MLE to estimate distribution parameters (MI method 1), and (v) imputation from a uniform distribution (MI method 2). Each data set mean was used to estimate enterovirus dose and infection risk. Root mean square error (RMSE) and bias were used to compare estimated and known doses and infection risks. MI method 1 resulted in the lowest dose and infection risk RMSE and bias ranges for most censoring degrees, predicting infection risks at most 1.17 × 10⁻² from known values under 97% censoring. MI method 2 was the next-best method overall. For medium to severe censoring, MI method 1 may result in the least error. If the distribution is uncertain, MI method 2 may be preferred to avoid distribution misspecification.

IMPORTANCE This study evaluates methods for handling data with low (10%) to severe (90%) left-censoring within an environmental microbiology context and demonstrates that some of these methods may be appropriate when using data containing concentrations below a limit of detection to estimate infection risks. Additionally, this study uses a skewed data set, which is an issue typically faced by environmental microbiologists.
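
The abstract names the censoring methods without giving implementation detail. The following Python sketch illustrates four of them (substitution, censored lognormal MLE, and the two imputation approaches) together with the RMSE and bias comparison. It is not the authors' code. Several choices are assumptions made only for illustration: the substitution divisor of sqrt(2), the uniform imputation range of 0 to LOD, the number of imputations, the sample size, and the lognormal sigma used to simulate data. All function names are hypothetical. Kaplan-Meier estimation and the dose-response step that converts a mean concentration into an infection risk are omitted.

```python
import numpy as np
from scipy import optimize, stats

LOD = 2.3  # limit of detection from the abstract (genome copies/liter)


def substitute_mean(x, censored, lod=LOD):
    """Substitution: replace non-detects with lod / sqrt(2) (assumed divisor)."""
    return float(np.where(censored, lod / np.sqrt(2.0), x).mean())


def fit_censored_lognormal(x, censored, lod=LOD):
    """Censored lognormal MLE; returns (mu, sigma) on the natural-log scale."""
    log_obs = np.log(x[~censored])
    n_cens = int(censored.sum())

    def neg_log_lik(params):
        mu, sigma = params[0], np.exp(params[1])  # log-sigma keeps sigma > 0
        detects = stats.norm.logpdf(log_obs, mu, sigma).sum()
        nondetects = n_cens * stats.norm.logcdf((np.log(lod) - mu) / sigma)
        return -(detects + nondetects)

    start = [log_obs.mean(), np.log(log_obs.std() + 1e-6)]
    fit = optimize.minimize(neg_log_lik, start, method="Nelder-Mead")
    return float(fit.x[0]), float(np.exp(fit.x[1]))


def mle_mean(x, censored, lod=LOD):
    """Mean of the fitted lognormal: exp(mu + sigma^2 / 2)."""
    mu, sigma = fit_censored_lognormal(x, censored, lod)
    return float(np.exp(mu + sigma**2 / 2.0))


def impute_lognormal_mean(x, censored, rng, lod=LOD, n_imputations=20):
    """Impute non-detects from the fitted lognormal truncated above at lod."""
    mu, sigma = fit_censored_lognormal(x, censored, lod)
    p_lod = stats.norm.cdf((np.log(lod) - mu) / sigma)  # P(X < lod) under the fit
    means = []
    for _ in range(n_imputations):
        u = rng.uniform(0.0, p_lod, size=int(censored.sum()))
        filled = x.copy()
        filled[censored] = np.exp(mu + sigma * stats.norm.ppf(u))
        means.append(filled.mean())
    return float(np.mean(means))


def impute_uniform_mean(x, censored, rng, lod=LOD, n_imputations=20):
    """Impute non-detects from Uniform(0, lod) (assumed bounds)."""
    means = []
    for _ in range(n_imputations):
        filled = x.copy()
        filled[censored] = rng.uniform(0.0, lod, size=int(censored.sum()))
        means.append(filled.mean())
    return float(np.mean(means))


def rmse_and_bias(estimates, truth):
    """Root mean square error and mean signed error against a known value."""
    err = np.asarray(estimates, dtype=float) - truth
    return float(np.sqrt(np.mean(err**2))), float(np.mean(err))


def run_demo(seed=0, n_reps=20, n=100, sigma=1.5, censor_frac=0.65):
    """Simulate censored lognormal data and score each method's mean estimate."""
    rng = np.random.default_rng(seed)
    # Choose mu so that the expected fraction below LOD equals censor_frac.
    mu = np.log(LOD) - sigma * stats.norm.ppf(censor_frac)
    true_mean = np.exp(mu + sigma**2 / 2.0)
    est = {"substitution": [], "mle": [], "mi_lognormal": [], "mi_uniform": []}
    for _ in range(n_reps):
        x = rng.lognormal(mu, sigma, size=n)
        censored = x < LOD  # true values below LOD are treated as unknown
        est["substitution"].append(substitute_mean(x, censored))
        est["mle"].append(mle_mean(x, censored))
        est["mi_lognormal"].append(impute_lognormal_mean(x, censored, rng))
        est["mi_uniform"].append(impute_uniform_mean(x, censored, rng))
    return {name: rmse_and_bias(vals, true_mean) for name, vals in est.items()}


if __name__ == "__main__":
    for name, (rmse, bias) in run_demo().items():
        print(f"{name:>13}: RMSE={rmse:.3f}  bias={bias:+.3f}")
```

Changing `censor_frac` in `run_demo` to 0.10, 0.35, 0.90, or 0.97 would mirror the censoring degrees described above, although the resulting numbers depend on the assumed simulation parameters and should not be read as a reproduction of the reported results.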

Keywords: left censored; limit of detection; quantitative microbial risk assessment.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computer Simulation
  • Data Interpretation, Statistical*
  • Drinking Water / virology
  • Enterovirus / genetics
  • Enterovirus / isolation & purification
  • Environmental Microbiology*
  • Genome, Viral
  • Humans
  • Limit of Detection*
  • Models, Statistical
  • Risk Assessment / methods*
  • Water Microbiology

Substances

  • Drinking Water