Cascade Forest-Based Model for Prediction of RNA Velocity

Molecules. 2022 Nov 15;27(22):7873. doi: 10.3390/molecules27227873.

Abstract

In recent years, single-cell RNA sequencing (scRNA-seq) technology has developed rapidly and has been widely used in biological and medical research, for example to study expression heterogeneity and transcriptome dynamics of single cells. RNA velocity is a recent topic in the study of cellular dynamics from scRNA-seq data: it recovers directional dynamic information from single-cell transcriptomics by linking measurements to the underlying dynamics of gene expression. Predicting the RNA velocity vector of each cell from its gene expression data, with the prediction formulated as a classification problem, is a new research direction. In this paper, we develop a cascade forest model to predict RNA velocity. Compared with other popular ensemble classifiers, such as XGBoost, RandomForest, LightGBM, NGBoost, and TabNet, it performs better in predicting RNA velocity. This paper provides guidance for researchers in selecting and applying appropriate classification tools in their analytical work and suggests some possible directions for future improvement of classification tools.
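The abstract frames velocity prediction as classification but does not say how a velocity vector becomes a class label. The sketch below shows one possible labeling scheme, purely as an illustration. It assumes the velocity has been projected into a 2-D embedding and that the direction is split into equal-width angular bins. The function name `velocity_to_class` and the default of 8 bins are hypothetical and are not taken from the paper.

```python
import math


def velocity_to_class(vx: float, vy: float, n_classes: int = 8) -> int:
    """Map a 2-D velocity vector to a direction-bin index in [0, n_classes).

    Assumption (not from the paper): labels are equal-width angular bins,
    counted counter-clockwise starting from the positive x-axis.
    """
    if vx == 0.0 and vy == 0.0:
        raise ValueError("a zero velocity vector has no direction")
    # atan2 returns an angle in (-pi, pi]; shift it into [0, 2*pi).
    angle = math.atan2(vy, vx) % (2 * math.pi)
    width = 2 * math.pi / n_classes
    # The final modulo guards against rounding that lands exactly on 2*pi.
    return int(angle // width) % n_classes


# Hypothetical per-cell velocities in a 2-D embedding.
velocities = [(1.0, 0.1), (0.1, 1.0), (-1.0, 0.1), (0.1, -1.0)]
labels = [velocity_to_class(vx, vy) for vx, vy in velocities]
```

Labels produced this way could serve as targets for any of the classifiers named in the abstract, with each cell's gene expression profile as the feature vector.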

Keywords: RNA velocity; cascade forest; ensemble classifier; scRNA-seq.

MeSH terms

  • Biomedical Research*
  • Humans
  • RNA* / genetics
  • Research Personnel
  • Sequence Analysis, RNA
  • Transcriptome

Substances

  • RNA