Disease misclassification in electronic healthcare database studies: Deriving validity indices-A contribution from the ADVANCE project

PLoS One. 2020 Apr 22;15(4):e0231333. doi: 10.1371/journal.pone.0231333. eCollection 2020.

Abstract

There is a strong and continuously growing interest in using large electronic healthcare databases to study health outcomes and the effects of pharmaceutical products. However, concerns regarding disease misclassification (i.e. classification errors of the disease status) and its impact on the study results are legitimate. Validation is therefore increasingly recognized as an essential component of database research. In this work, we elucidate the interrelations between the true prevalence of a disease in a database population (i.e. prevalence assuming no disease misclassification), the observed prevalence subject to disease misclassification, and the most common validity indices: sensitivity, specificity, positive and negative predictive value. Based on this, we obtained analytical expressions to derive all the validity indices and true prevalence from the observed prevalence and any combination of two other parameters. The analytical expressions can be used for various purposes. Most notably, they can be used to obtain an estimate of the observed prevalence adjusted for outcome misclassification from any combination of two validity indices and to derive validity indices from each other which would otherwise be difficult to obtain. To allow researchers to easily use the analytical expressions, we additionally developed a user-friendly and freely available web-application.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Databases, Factual*
  • Disease / classification*
  • Electronic Health Records
  • Humans
  • User-Computer Interface

Grants and funding

Finanacial Disclosure: This research was funded by the Innovative Medicines Initiative (IMI) Joint Undertaking through the ADVANCE project [№ 115557]. The IMI is a joint initiative (publicprivate partnership) of the European Commission and the European Federation of Pharmaceutical Industries and Associations (EFPIA) to improve the competitive situation of the European Union in the field of pharmaceutical research. The IMI provided support in the form of salaries for KB, TDS, CD and RG but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. AR and NA did not receive any financial compensation for their contribution to this research. The specific roles of the authors are articulated in the ‘author contributions’ section.