Identification of Transcriptome Biomarkers for Severe COVID-19 with Machine Learning Methods

Biomolecules. 2022 Nov 23;12(12):1735. doi: 10.3390/biom12121735.

Abstract

The rapid spread of COVID-19 has become a major threat to life and health worldwide. Because different patients may develop different symptoms, COVID-19 patients at different disease stages and severities require individualized treatment. In this study, we employed machine learning methods to discover biomarkers that may accurately classify COVID-19 across disease states and severities. We analyzed blood gene expression profiles from 50 COVID-19 patients without intensive care, 50 COVID-19 patients with intensive care, 10 non-COVID-19 individuals without intensive care, and 16 non-COVID-19 individuals with intensive care. Boruta was first used to remove irrelevant gene features from the expression profiles, and the minimum redundancy maximum relevance (mRMR) method was then applied to rank the remaining features. The resulting ranked feature list was fed into the incremental feature selection (IFS) method to identify the essential genes and build powerful classifiers. The molecular mechanisms of some biomarker genes were discussed in light of recent studies, and the biological functions enriched among the essential genes were examined. Our findings suggest that genes including UBE2C, PCLAF, CDK1, CCNB1, MND1, APOBEC3G, TRAF3IP3, CD48, and GZMA play key roles in distinguishing the different states and severities of COVID-19. These results provide a new point of reference for understanding the etiology of the disease and for facilitating precise therapy.
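The ranking and incremental feature selection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The Boruta pre-filter is omitted, absolute Pearson correlation stands in for the mutual information that minimum redundancy maximum relevance normally uses, and the classifier is assumed to be a 1-nearest-neighbour model scored by leave-one-out accuracy (the abstract does not name the classifiers). The data are synthetic with two classes rather than the four groups in the study, and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0 or sb == 0:
        return 0.0
    ma, mb = mean(a), mean(b)
    return mean([(x - ma) * (y - mb) for x, y in zip(a, b)]) / (sa * sb)


def mrmr_rank(X, y):
    """Greedy mRMR-style ranking: relevance to the label minus mean redundancy.

    Assumption: |Pearson r| replaces mutual information to keep the sketch short.
    """
    n_features = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_features)]
    relevance = [abs(pearson(col, y)) for col in cols]
    ranked, remaining = [], list(range(n_features))

    def score(j):
        if not ranked:
            return relevance[j]
        redundancy = mean([abs(pearson(cols[j], cols[k])) for k in ranked])
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the given features."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def incremental_feature_selection(X, y, ranked):
    """Score the top-k ranked features for every k; return (best_k, accuracies)."""
    accuracies = [loo_accuracy(X, y, ranked[:k]) for k in range(1, len(ranked) + 1)]
    # On ties, max() keeps the first hit, i.e. the smallest feature subset.
    best_k = max(range(len(accuracies)), key=accuracies.__getitem__) + 1
    return best_k, accuracies


def make_toy_data(n_per_class=20, n_noise=6, seed=0):
    """Synthetic two-class data: features 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(4.0 * label, 1.0), rng.gauss(-4.0 * label, 1.0)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y


X, y = make_toy_data()
ranked = mrmr_rank(X, y)
best_k, accuracies = incremental_feature_selection(X, y, ranked)
print("feature ranking:", ranked)
print("best k:", best_k, "leave-one-out accuracy:", accuracies[best_k - 1])
```

In this sketch the essential features are the top `best_k` entries of `ranked`. A fuller pipeline would first discard irrelevant features (for example with a Boruta implementation) and would likely compare several classifiers at each subset size.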

Keywords: COVID-19; biomarker; enrichment analysis; machine learning; transcriptomic.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers
  • COVID-19* / diagnosis
  • COVID-19* / genetics
  • Humans
  • Machine Learning
  • Transcriptome*

Substances

  • Biomarkers