Electronic medical record-based deep data cleaning and phenotyping improve the diagnostic validity and mortality assessment of infective endocarditis: medical big data initiative of CMUH

Biomedicine (Taipei). 2021 Sep 1;11(3):59-67. doi: 10.37796/2211-8039.1267. eCollection 2021.

Abstract

Background: International Classification of Diseases (ICD) code-based claims databases are often used to study infective endocarditis (IE). However, the quality of ICD coding can influence the reliability of IE research. The impact of complementing the ICD-only approach with data extracted from electronic medical records (EMRs) has yet to be explored.

Methods: We identified adult patients with discharge ICD codes for IE (ICD-9: 421, 112.81, 036.42, 098.84, 115.04, 115.14, 115.94, 424.9; ICD-10: I33, I38, I39) at China Medical University Hospital during 2005-2016. Data were extracted according to the modified Duke criteria to establish a reference group of patients with definite or possible IE. Clinical characteristics and in-hospital mortality were compared between ICD-identified and Duke-confirmed cases. The positive predictive value (PPV) was used to quantify how well various phenotyping algorithms identified IE.
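
As an illustration of the PPV metric, the following minimal sketch computes a PPV and an approximate 95% confidence interval. The Wilson score interval is an assumption made here; the abstract does not state which interval method was used. The function name `ppv_with_wilson_ci` is hypothetical, and the count of 336 confirmed cases is back-calculated from the reported 56.7% of 593 patients rather than quoted from the source.

```python
from math import sqrt


def ppv_with_wilson_ci(confirmed, flagged, z=1.96):
    """Return (ppv, lower, upper) for `confirmed` true cases among `flagged` cases.

    Uses the Wilson score interval (an assumption; the source does not
    name its interval method). z=1.96 corresponds to roughly 95% coverage.
    """
    if flagged <= 0 or not 0 <= confirmed <= flagged:
        raise ValueError("need 0 <= confirmed <= flagged and flagged > 0")
    p = confirmed / flagged
    denom = 1 + z ** 2 / flagged
    center = (p + z ** 2 / (2 * flagged)) / denom
    half = z * sqrt(p * (1 - p) / flagged + z ** 2 / (4 * flagged ** 2)) / denom
    return p, center - half, center + half


# ICD-only approach: 593 flagged patients, about 336 meeting the Duke criteria
# (336 is inferred from the rounded 56.7%, not reported directly).
ppv, lower, upper = ppv_with_wilson_ci(336, 593)
print(f"PPV = {ppv:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```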

Results: A total of 593 patients with discharge ICD codes for IE were identified, of whom only 56.7% met the modified Duke criteria. Crude in-hospital mortality was 24.4% for Duke-confirmed IE and 8.2% for Duke-rejected IE. The adjusted in-hospital mortality for ICD-identified IE was lower than that for Duke-confirmed IE by a difference of 5.1%. The best PPV (0.90, 95% CI 0.86-0.93) was achieved when major components of the Duke criteria (positive blood culture and vegetation) were integrated with ICD codes.
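
A phenotyping rule of this kind could be sketched as follows. This is a hypothetical illustration, not the authors' algorithm: the abstract does not specify how the components were combined, so requiring an IE code together with both a positive blood culture and a vegetation is an assumption. The `Admission` fields and the prefix-matching of codes are also assumed for the example.

```python
from dataclasses import dataclass
from typing import List

# Code lists taken from the Methods paragraph.
IE_ICD9 = ("421", "112.81", "036.42", "098.84", "115.04", "115.14", "115.94", "424.9")
IE_ICD10 = ("I33", "I38", "I39")


@dataclass
class Admission:
    """Hypothetical record layout for one hospital admission."""
    discharge_codes: List[str]
    positive_blood_culture: bool
    vegetation_on_echo: bool


def has_ie_code(codes):
    """True if any discharge code starts with one of the listed IE codes."""
    prefixes = IE_ICD9 + IE_ICD10
    return any(code.strip().upper().startswith(prefixes) for code in codes)


def flag_ie(admission):
    """Assumed rule: IE code AND positive blood culture AND vegetation."""
    return (
        has_ie_code(admission.discharge_codes)
        and admission.positive_blood_culture
        and admission.vegetation_on_echo
    )


example = Admission(["421.0", "428.0"], positive_blood_culture=True, vegetation_on_echo=True)
print(flag_ie(example))
```

Prefix matching assumes dotted code strings such as `421.0` or `I33.0`; undotted formats would need normalization first.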

Conclusion: Integrating EMR data can considerably improve the accuracy of ICD-only approaches in phenotyping IE, which can improve the validity of EMR-based studies and their applications, including real-time surveillance and clinical decision support.

Keywords: Disease phenotyping; Electronic medical record; Infective endocarditis; International Classification of Diseases; Positive predictive value.