DISSECTING TUMOR TRANSCRIPTIONAL HETEROGENEITY FROM SINGLE-CELL RNA-SEQ DATA BY GENERALIZED BINARY COVARIANCE DECOMPOSITION

bioRxiv [Preprint]. 2023 Aug 17:2023.08.15.553436. doi: 10.1101/2023.08.15.553436.

Abstract

Profiling tumors with single-cell RNA sequencing (scRNA-seq) has the potential to identify recurrent patterns of transcriptional variation related to cancer progression, and so produce new therapeutically relevant insights. However, strong inter-tumor heterogeneity often obscures more subtle patterns that are shared across tumors, some of which may characterize clinically relevant disease subtypes. Here we introduce a new statistical method, generalized binary covariance decomposition, to address this problem. We show that this method can help decompose transcriptional heterogeneity into interpretable components (including patient-specific, dataset-specific, and shared components relevant to disease subtypes) and that, in the presence of strong inter-tumor heterogeneity, it can produce more interpretable results than existing widely used methods. Applied to data from three studies on pancreatic ductal adenocarcinoma (PDAC), our method produces a refined characterization of existing tumor subtypes (e.g., classical vs. basal) and identifies a new gene expression program (GEP) that is prognostic of poor survival independent of established prognostic factors such as tumor stage and subtype. The new GEP is enriched for genes involved in a variety of stress responses and suggests a potentially important role for the integrated stress response in PDAC development and prognosis.

Publication types

  • Preprint