Establishing Classifiers With Clinical Laboratory Indicators to Distinguish COVID-19 From Community-Acquired Pneumonia: Retrospective Cohort Study

Wanfa Dai; Pei-Feng Ke; Zhen-Zhen Li; Qi-Zhen Zhuang; Wei Huang; Yi Wang; Yujuan Xiong; Xian-Zhang Huang

doi:10.2196/23390

J Med Internet Res. 2021 Feb 22;23(2):e23390. doi: 10.2196/23390.

Authors

Wanfa Dai^#¹, Pei-Feng Ke^#²,³, Zhen-Zhen Li⁴, Qi-Zhen Zhuang⁴, Wei Huang¹, Yi Wang²,⁴, Yujuan Xiong^#²,³, Xian-Zhang Huang^#²,³

Affiliations

¹ Department of Respiration, Gong An County People's Hospital, Jingzhou, China.
² Department of Laboratory Medicine, The Second Affiliated Hospital, Guangzhou University of Chinese Medicine, Guangzhou, China.
³ Guangdong Provincial Key Laboratory of Research on Emergency in Traditional Chinese Medicine, Guangzhou, China.
⁴ Second Clinical Medical College, Guangzhou University of Chinese Medicine, Guangzhou, China.

^# Contributed equally.

PMID: 33534722
PMCID: PMC7901596
DOI: 10.2196/23390

Abstract

Background: The initial symptoms of COVID-19 closely resemble those of community-acquired pneumonia (CAP), so it is difficult to distinguish COVID-19 from CAP on the basis of clinical symptoms and imaging examination.

Objective: The objective of our study was to construct an effective model for the early identification of COVID-19 that would also distinguish it from CAP.

Methods: The clinical laboratory indicators (CLIs) of 61 COVID-19 patients and 60 CAP patients were analyzed retrospectively. Random combinations of various CLIs (ie, CLI combinations) were utilized to establish COVID-19 versus CAP classifiers with machine learning algorithms, including random forest classifier (RFC), logistic regression classifier, and gradient boosting classifier (GBC). The performance of the classifiers was assessed by calculating the area under the receiver operating characteristic curve (AUROC) and recall rate in COVID-19 prediction using the test data set.
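
The two evaluation metrics named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the labels and scores below are hypothetical placeholders, the 0.5 decision threshold is an assumption, and the function names `recall` and `auroc` are introduced here only for illustration. AUROC is computed as the probability that a randomly chosen COVID-19 case receives a higher score than a randomly chosen CAP case.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives (here: COVID-19) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive samples")
    return tp / (tp + fn)


def auroc(y_true, scores, positive=1):
    """AUROC via pairwise comparison: P(score of positive > score of negative).

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels (1 = COVID-19, 0 = CAP) and classifier scores.
# These values are invented for illustration and are not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

test_recall = recall(y_true, y_pred)
test_auroc = auroc(y_true, scores)
print(f"recall={test_recall:.4f}, AUROC={test_auroc:.4f}")
```

Recall depends on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason to report both.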

Results: The classifiers that were constructed with the three algorithms from 43 CLI combinations showed high performance (recall rate >0.9 and AUROC >0.85) in COVID-19 prediction for the test data set. Among the high-performance classifiers, several CLIs showed a high usage rate; these included procalcitonin (PCT), mean corpuscular hemoglobin concentration (MCHC), uric acid, albumin, albumin to globulin ratio (AGR), neutrophil count, red blood cell (RBC) count, monocyte count, basophil count, and white blood cell (WBC) count. All of these except basophil count also had high feature importance. Nine feature combinations (FCs) yielded classifiers with an AUROC equal to 1.0 when using the RFC or GBC algorithms; the FC of PCT, AGR, uric acid, WBC count, neutrophil count, basophil count, RBC count, and MCHC was representative of them. Replacing any CLI in these FCs would lead to a significant reduction in the performance of the classifiers that were built with them.
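
The selection step reported in the Results can be sketched as follows. The abstract does not define "usage rate" precisely, so this sketch assumes it means the fraction of high-performance CLI combinations that contain a given CLI. The `results` records, their metric values, and the helper names `high_performance` and `usage_rate` are all hypothetical; only the two thresholds come from the text.

```python
from collections import Counter

# Thresholds stated in the abstract for a "high performance" classifier.
RECALL_MIN = 0.9
AUROC_MIN = 0.85


def high_performance(results):
    """Keep only combinations whose test recall and AUROC exceed both thresholds."""
    return [r for r in results
            if r["recall"] > RECALL_MIN and r["auroc"] > AUROC_MIN]


def usage_rate(selected):
    """Assumed definition: share of selected combinations that include each CLI."""
    n = len(selected)
    if n == 0:
        return {}
    counts = Counter(cli for r in selected for cli in r["clis"])
    return {cli: c / n for cli, c in counts.items()}


# Hypothetical evaluation records, invented for illustration (not study data).
results = [
    {"clis": ("PCT", "AGR", "uric acid"), "recall": 0.95, "auroc": 0.97},
    {"clis": ("PCT", "MCHC", "WBC count"), "recall": 0.92, "auroc": 0.90},
    {"clis": ("albumin", "monocyte count"), "recall": 0.80, "auroc": 0.88},
    {"clis": ("PCT", "AGR", "RBC count"), "recall": 0.91, "auroc": 0.84},
]

selected = high_performance(results)
rates = usage_rate(selected)
for cli, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{cli}: {rate:.2f}")
```

With these placeholder records, only the first two combinations pass both thresholds, so a CLI that appears in both of them has a usage rate of 1.0.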

Conclusions: The classifiers constructed with only a few specific CLIs could efficiently distinguish COVID-19 from CAP, which could help clinicians perform early isolation and centralized management of COVID-19 patients.

Keywords: COVID-19; classification algorithm; classifier; clinical laboratory indicators; community-acquired pneumonia.

©Wanfa Dai, Pei-Feng Ke, Zhen-Zhen Li, Qi-Zhen Zhuang, Wei Huang, Yi Wang, Yujuan Xiong, Xian-Zhang Huang. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 22.02.2021.

MeSH terms

Area Under Curve
COVID-19 / blood
COVID-19 / diagnosis*
COVID-19 / virology
Community-Acquired Infections / blood
Community-Acquired Infections / diagnosis*
Female
Humans
Laboratories
Leukocyte Count
Logistic Models
Machine Learning*
Male
Middle Aged
Pneumonia / blood
Pneumonia / diagnosis*
Procalcitonin / blood
ROC Curve
Retrospective Studies
SARS-CoV-2 / pathogenicity*

Substances

Procalcitonin