A machine learning approach identifies distinct early-symptom cluster phenotypes which correlate with hospitalization, failure to return to activities, and prolonged COVID-19 symptoms

PLoS One. 2023 Feb 9;18(2):e0281272. doi: 10.1371/journal.pone.0281272. eCollection 2023.

Abstract

Background: Accurate COVID-19 prognosis is a critical aspect of acute and long-term clinical management. We identified discrete clusters of early-stage symptoms that may delineate groups with distinct disease severity phenotypes, including risk of developing long-term symptoms and associated inflammatory profiles.

Methods: This analysis included 1,273 SARS-CoV-2-positive U.S. Military Health System beneficiaries with quantitative symptom scores (FLU-PRO Plus). We employed machine-learning approaches to identify symptom clusters and compared the risk of hospitalization and long-term symptoms, as well as peak C-reactive protein (CRP) and interleukin-6 (IL-6) concentrations, across clusters.
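
The abstract does not name the clustering algorithm, so the following is only a minimal sketch of how symptom-score clustering might look, assuming plain k-means with three clusters. The domain names, the 12 synthetic score vectors, and the farthest-first initialization are all hypothetical choices made for illustration; none of them come from the study.

```python
# Illustrative only: k-means on synthetic symptom-domain scores.
# The algorithm, the data, and the domain layout are assumptions, not the study's method.

def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic initialization: start at the first point, then repeatedly
    pick the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every participant.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical mean scores per domain: (nasal, sensory, respiratory/systemic).
profiles = [(3.0, 0.5, 0.5), (0.5, 3.0, 0.5), (1.0, 1.0, 3.0)]
offsets = [(-0.2, 0.1, 0.0), (0.1, -0.1, 0.2), (0.0, 0.2, -0.1), (0.2, 0.0, 0.1)]
points = [tuple(b + o for b, o in zip(base, off))
          for base in profiles for off in offsets]

centroids, labels = kmeans(points, k=3)
for c, centroid in enumerate(centroids):
    print(c, [round(v, 2) for v in centroid], labels.count(c))
```

In a real analysis the number of clusters would need to be justified (for example with a stability or silhouette criterion), and the resulting clusters would be named only after inspecting which symptoms load on each.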

Results: We identified three distinct clusters of participants based on their FLU-PRO Plus symptoms: cluster 1 ("Nasal cluster") is highly correlated with reporting runny/stuffy nose and sneezing, cluster 2 ("Sensory cluster") is highly correlated with loss of smell or taste, and cluster 3 ("Respiratory/Systemic cluster") is highly correlated with the respiratory (cough, trouble breathing, among others) and systemic (body aches, chills, among others) domain symptoms. Participants in the Respiratory/Systemic cluster were twice as likely as those in the Nasal cluster to have been hospitalized, and 1.5 times as likely to report that they had not returned to activities, which remained significant after controlling for confounding covariates (P < 0.01). Participants in the Respiratory/Systemic and Sensory clusters were more likely to have symptoms at six months after symptom onset (P = 0.03). We observed higher peak CRP and IL-6 in the Respiratory/Systemic cluster (P < 0.01).
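
To make a statement such as "twice as likely" concrete, the sketch below computes an unadjusted risk ratio with a log-method 95% confidence interval. The counts are hypothetical and chosen only so that the ratio equals 2; they are not the study's data. The abstract does not state which effect measure or model was used, and this sketch does not perform the covariate adjustment the abstract describes.

```python
import math

def risk_ratio(events_a, total_a, events_b, total_b, z=1.96):
    """Unadjusted risk ratio of group A vs. group B with a log-method
    (Katz) confidence interval. All inputs here are illustrative counts."""
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high

# Hypothetical counts: 40 of 400 hospitalized in one cluster, 20 of 400 in another.
rr, ci_low, ci_high = risk_ratio(40, 400, 20, 400)
print(f"RR = {rr:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```

An adjusted estimate would typically come from a regression model that includes the confounding covariates, which is beyond this sketch.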

Conclusions: We identified early symptom profiles potentially associated with hospitalization, return-to-activities, long-term symptoms, and inflammatory profiles. These findings may assist in patient prognosis, including prediction of long COVID risk.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural

MeSH terms

  • COVID-19* / epidemiology
  • Hospitalization
  • Humans
  • Interleukin-6
  • Machine Learning
  • Phenotype
  • Post-Acute COVID-19 Syndrome
  • SARS-CoV-2

Substances

  • Interleukin-6

Grants and funding

This work was supported by awards from the Defense Health Program (HU00012020067) and the National Institute of Allergy and Infectious Diseases (HU00011920111). The protocol was executed by the Infectious Disease Clinical Research Program (IDCRP), a Department of Defense (DoD) program executed by the Uniformed Services University of the Health Sciences (USUHS) through a cooperative agreement by the Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc. (HJF). This project has been funded in part by the National Institute of Allergy and Infectious Diseases at the National Institutes of Health, under an interagency agreement (Y1-AI-5072).