scLINE: A multi-network integration framework based on network embedding for representation of single-cell RNA-seq data

J Biomed Inform. 2021 Oct:122:103899. doi: 10.1016/j.jbi.2021.103899. Epub 2021 Sep 3.

Abstract

Single-cell RNA sequencing (scRNA-seq) is fast becoming a powerful technology that is revolutionizing biomedical studies of development, immunology and cancer by providing genome-scale transcriptional profiles at unprecedented throughput and resolution. However, because of the low capture rate and frequent drop-out events in the sequencing process, scRNA-seq data suffer from extremely high sparsity and variability, which complicates data analysis. Here we propose a novel method, scLINE, for learning low-dimensional representations of scRNA-seq data. scLINE is based on a network embedding model that jointly considers multiple gene-gene interaction networks, which facilitates the incorporation of prior biological knowledge for signal extraction. We comprehensively evaluated scLINE on eight single-cell datasets. The results show that scLINE achieved comparable or higher performance than competing methods, including PCA, t-SNE and Isomap, in terms of internal validation metrics and clustering accuracy. The low-dimensional representations learned by scLINE are effective for downstream single-cell analysis, such as visualization, clustering and cell typing. We have implemented scLINE as an easy-to-use R package, which can be incorporated into existing scRNA-seq analysis pipelines or tools for data preprocessing.
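The abstract does not state which measure of clustering accuracy was used. As an illustration only, the sketch below computes the adjusted Rand index (ARI), one common way to compare predicted clusters against known cell-type labels. The choice of ARI, the function name `adjusted_rand_index`, and the toy labels are assumptions made for this example; none of them come from the paper or from the scLINE package.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two labelings of the same cells.

    Hypothetical helper for illustration; not part of scLINE.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("labelings must have the same length")
    n = len(true_labels)
    if n < 2:
        # No pairs of cells to compare; treat as trivially identical.
        return 1.0

    # Count cells per (true, predicted) pair and per label on each side.
    pair_counts = Counter(zip(true_labels, pred_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)

    # Number of cell pairs that fall together in each grouping.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. a single cluster in both labelings.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Toy labels (made up): cluster names differ, but the grouping is identical.
cell_types = ["T", "T", "B", "B", "NK", "NK"]
clusters = [2, 2, 0, 0, 1, 1]
score = adjusted_rand_index(cell_types, clusters)
```

Because ARI depends only on which cells are grouped together, it is unaffected by how clusters are numbered, so the same function could score clusters obtained from any low-dimensional representation (for example PCA, t-SNE, Isomap, or scLINE embeddings) against one reference annotation.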

Keywords: Cell type clustering; Low dimensional representation; Multi-network integration; Network embedding; Single-cell RNA sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Gene Expression Profiling
  • Gene Regulatory Networks*
  • RNA-Seq
  • Sequence Analysis, RNA
  • Single-Cell Analysis*