Data-driven methods distort optimal cutoffs and accuracy estimates of depression screening tools: a simulation study using individual participant data

J Clin Epidemiol. 2021 Sep:137:137-147. doi: 10.1016/j.jclinepi.2021.03.031. Epub 2021 Apr 8.

Abstract

Objective: To evaluate, across multiple sample sizes, the degree that data-driven methods result in (1) optimal cutoffs different from population optimal cutoff and (2) bias in accuracy estimates.

Study design and setting: A total of 1,000 samples of sample size 100, 200, 500 and 1,000 each were randomly drawn to simulate studies of different sample sizes from a database (n = 13,255) synthesized to assess Edinburgh Postnatal Depression Scale (EPDS) screening accuracy. Optimal cutoffs were selected by maximizing Youden's J (sensitivity+specificity-1). Optimal cutoffs and accuracy estimates in simulated samples were compared to population values.

Results: Optimal cutoffs in simulated samples ranged from ≥ 5 to ≥ 17 for n = 100, ≥ 6 to ≥ 16 for n = 200, ≥ 6 to ≥ 14 for n = 500, and ≥ 8 to ≥ 13 for n = 1,000. Percentage of simulated samples identifying the population optimal cutoff (≥ 11) was 30% for n = 100, 35% for n = 200, 53% for n = 500, and 71% for n = 1,000. Mean overestimation of sensitivity and underestimation of specificity were 6.5 percentage point (pp) and -1.3 pp for n = 100, 4.2 pp and -1.1 pp for n = 200, 1.8 pp and -1.0 pp for n = 500, and 1.4 pp and -1.0 pp for n = 1,000.

Conclusions: Small accuracy studies may identify inaccurate optimal cutoff and overstate accuracy estimates with data-driven methods.

Keywords: Accuracy estimates; Bias; Cherry-picking; Data-driven methods; Depression; Optimal cutoff.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Data Science*
  • Depression / diagnosis*
  • Female
  • Humans
  • Psychiatric Status Rating Scales / statistics & numerical data*
  • Reproducibility of Results