Machine-Learning Approaches for Predicting the Need of Oxygen Therapy in Early-Stage COVID-19 in Japan: Multicenter Retrospective Observational Study

Syunsuke Yamanaka; Koji Morikawa; Hiroyuki Azuma; Maki Yamanaka; Yoshimitsu Shimada; Toru Wada; Hideyuki Matano; Naoki Yamada; Osamu Yamamura; Hiroyuki Hayashi

doi:10.3389/fmed.2022.846525

Front Med (Lausanne). 2022 Feb 23;9:846525. doi: 10.3389/fmed.2022.846525. eCollection 2022.

Affiliations

¹ Department of Emergency Medicine and General Internal Medicine, University of Fukui Hospital, Fukui, Japan.
² Connect Inc., Tokyo, Japan.
³ Department of Emergency Medicine, Fukui Prefectural Hospital, Fukui, Japan.
⁴ Department of Emergency Medicine, Tannan Regional Medical Center, Sabae, Japan.
⁵ Department of Emergency Medicine, Japanese Red Cross Fukui Hospital, Fukui, Japan.
⁶ Department of Emergency Medicine, Sugita Genpaku Memorial Obama Municipal Hospital, Obama, Japan.
⁷ Department of Emergency Medicine, Fukui-ken Saiseikai Hospital, Fukui, Japan.
⁸ Department of Community Medicine, Faculty of Medicine, University of Fukui Hospital, Fukui, Japan.

Abstract

Background: Early prediction of the need for oxygen therapy in patients with coronavirus disease 2019 (COVID-19) is vital for triage. Several machine-learning prognostic models for COVID-19 are currently available, but they have rarely been externally validated. As a result, most reported predictive performance is likely optimistic and carries a high risk of bias. This study aimed to develop and validate a model that predicts the need for oxygen therapy in the early stages of COVID-19 using a sizable multicenter dataset.

Methods: This multicenter retrospective study included consecutive hospitalized patients with COVID-19 confirmed by reverse transcription polymerase chain reaction at 11 medical institutions in Fukui, Japan. We developed and validated seven machine-learning models (e.g., penalized logistic regression) using routinely collected data (e.g., demographics, simple blood tests). The primary outcome was the need for oxygen therapy (≥1 L/min or SpO₂ ≤ 94%) during hospitalization. Model performance was evaluated on the test set (a randomly selected 20% of the data, used for internal validation) using the C-statistic, calibration slope, and association measures (e.g., sensitivity). Among the seven models, the best-performing one was re-evaluated using an external dataset. We compared model performance with that of the A-DROP criteria (a modified version of CURB-65) as the conventional method.
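As a rough illustration of the evaluation described in the Methods, the sketch below holds out a random 20% of records and computes a C-statistic as the fraction of positive/negative pairs in which the positive case receives the higher predicted risk. It is not the study's code: the function names, the fixed seed, the rounding of the test-set size, and the toy labels and scores are all assumptions made for illustration.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=0):
    """Randomly hold out a fraction of records as a test set.

    The seed and the rounding rule are illustrative assumptions,
    not details taken from the study.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def c_statistic(y_true, y_score):
    """C-statistic (area under the ROC curve) by pairwise concordance.

    For every (positive, negative) pair, count 1 if the positive case
    has the higher score and 0.5 if the scores are tied.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = needed oxygen therapy, 0 = did not.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(c_statistic(labels, scores))  # 3 of 4 pairs concordant -> 0.75
```

The pairwise form is quadratic in the number of cases, which is acceptable at the scale of a few hundred patients; a rank-based formula gives the same value for larger datasets.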

Results: Of the 396 patients with COVID-19 included in model development, 102 (26%) required oxygen therapy during hospitalization. In the internal validation, all machine-learning models except the k-nearest neighbor model had higher discrimination ability than the A-DROP criteria (P < 0.01). XGBoost had the highest C-statistic in the internal validation (0.92 vs. 0.69 for the A-DROP criteria; P < 0.001). In the external validation on a temporally independent dataset of 728 patients (106 [15%] required oxygen therapy), the XGBoost model again had a higher C-statistic (0.88 vs. 0.69 for the A-DROP criteria; P < 0.001).

Conclusions: Machine-learning models outperformed the conventional A-DROP criteria in predicting the need for oxygen therapy in the early stages of COVID-19.

Keywords: COVID-19; PROBAST; TRIPOD; machine learning; medical triage; multicenter; prognostic model.