MD-ALL: an Integrative Platform for Molecular Diagnosis of B-cell Acute Lymphoblastic Leukemia

Res Sq [Preprint]. 2023 Apr 14:rs.3.rs-2798895. doi: 10.21203/rs.3.rs-2798895/v1.

Abstract

B-cell acute lymphoblastic leukemia (B-ALL) consists of dozens of subtypes defined by distinct gene expression profiles (GEPs) and various genetic lesions. With the application of transcriptome sequencing (RNA-seq), multiple novel subtypes have been identified, which lead to an advanced B-ALL classification and risk-stratification system. However, the complexity of analyzing RNA-seq data for B-ALL classification hinders the implementation of the new B-ALL taxonomy. Here, we introduce MD-ALL (Molecular Diagnosis of ALL), a user-friendly platform featuring sensitive and accurate B-ALL classification based on GEPs and sentinel genetic alterations. In this study, we systematically analyzed 2,955 B-ALL RNA-seq samples and generated a reference dataset representing all the reported B-ALL subtypes. Using multiple machine learning algorithms, we identified the feature genes and then established highly accurate models for B-ALL classification using either bulk or single-cell RNA-seq data. Importantly, this platform integrates the key genetic lesions, including sequence mutations, large-scale copy number variations, and gene rearrangements, to perform comprehensive and definitive B-ALL classification. Through validation in a hold-out cohort of 974 samples, our models demonstrated superior performance for B-ALL classification compared with alternative tools. In summary, MD-ALL is a user-friendly B-ALL classification platform designed to enable integrative, accurate, and comprehensive B-ALL subtype classification.

Publication types

  • Preprint