Latent COVID-19 Clusters in Patients with Chronic Respiratory Conditions

Stud Health Technol Inform. 2020 Nov 23:275:32-36. doi: 10.3233/SHTI200689.

Abstract

The goal of this paper was to apply unsupervised machine learning techniques to discover latent COVID-19 clusters in patients with chronic lower respiratory diseases (CLRD). Patients who underwent testing for SARS-CoV-2 were identified from electronic medical records. The analytical dataset comprised 2,328 CLRD patients, of whom 1,029 tested positive for COVID-19. We used the factor analysis for mixed data method for preprocessing. It performed principal component analysis on numeric values and multiple correspondence analysis on categorical values, which converted the categorical data into numeric form. Cluster analysis was an effective means both to distinguish subgroups of CLRD patients with COVID-19 and to identify patient clusters that were adversely affected by the infection. Age, comorbidity index, and race were important factors for cluster separation. Furthermore, diseases of the circulatory system, the nervous system and sense organs, the digestive system, and the genitourinary system, as well as metabolic diseases and immunity disorders, were also important criteria in the resulting cluster analyses.
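The preprocessing step described in the abstract, factor analysis for mixed data (FAMD), can be illustrated with a minimal sketch. This is not the authors' code. It assumes the textbook formulation: numeric columns are z-scored, each categorical column is expanded into indicator columns scaled by the inverse square root of the category proportion, and a single PCA (via SVD) is run on the combined matrix. The function name `famd_scores`, the variable names, and the synthetic data are all hypothetical and carry no relation to the study's dataset.

```python
import numpy as np

def famd_scores(numeric, categorical, n_components=2):
    """Return FAMD-style component scores (hypothetical helper).

    numeric:     (n, p) array of numeric features, assumed non-constant.
    categorical: list of length-n label sequences, one per categorical feature.
    """
    numeric = np.asarray(numeric, dtype=float)
    # PCA part: standardize each numeric column to zero mean, unit variance.
    z = (numeric - numeric.mean(axis=0)) / numeric.std(axis=0)
    blocks = [z]
    for col in categorical:
        col = np.asarray(col)
        levels = np.unique(col)
        # One indicator column per observed category level.
        indicators = (col[:, None] == levels[None, :]).astype(float)
        props = indicators.mean(axis=0)
        # MCA-style weighting: rare categories get larger weight, then center.
        scaled = indicators / np.sqrt(props)
        blocks.append(scaled - scaled.mean(axis=0))
    x = np.hstack(blocks)
    # PCA on the combined matrix; scores are U * S.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic, made-up inputs for illustration only.
rng = np.random.default_rng(0)
age = rng.normal(60, 15, size=50)
comorbidity = rng.integers(0, 8, size=50).astype(float)
group = rng.choice(["A", "B", "C"], size=50)

scores = famd_scores(np.column_stack([age, comorbidity]), [group])
```

The resulting all-numeric `scores` matrix is the kind of input a clustering algorithm can consume. The abstract does not state which clustering algorithm was used, so none is assumed here.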

Keywords: COVID-19; Chronic lower respiratory diseases; cluster analysis.

MeSH terms

  • Betacoronavirus*
  • COVID-19
  • Coronavirus Infections* / epidemiology
  • Electronic Health Records*
  • Humans
  • Pandemics*
  • Pneumonia, Viral* / epidemiology
  • SARS-CoV-2
  • Unsupervised Machine Learning*