Tensor decomposition discriminates tissues using scATAC-seq

Biochim Biophys Acta Gen Subj. 2023 Jun;1867(6):130360. doi: 10.1016/j.bbagen.2023.130360. Epub 2023 Mar 31.

Abstract

ATAC-seq is a powerful tool for measuring the landscape structure of a chromosome. scATAC-seq is a more recent version of ATAC-seq performed at single-cell resolution. The problem with scATAC-seq is data sparsity, because most of the genomic sites are inaccessible. Here, tensor decomposition (TD) was used to fill in missing values. In this study, TD was applied to massive scATAC-seq datasets binned into intervals of approximately 200 bp, where the number of intervals can reach 13,627,618. Currently, no other methods can deal with sparse matrices this large. The proposed method not only provided a UMAP embedding that coincides with tissue specificity, but also selected genes associated with various biological enrichment terms and transcription factor targeting. This suggests that TD is a useful tool for processing the large sparse matrices generated from scATAC-seq.
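The general idea of decomposing a large sparse region-by-cell-by-tissue array can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy random binary dataset, stacks hypothetical per-tissue matrices into a mode-1 unfolding, and uses a truncated sparse SVD (one step of an HOSVD-style decomposition) as a stand-in for the TD described in the abstract. The names `toy_tissue` and `embed_cells`, the matrix sizes, and the rank `k` are all illustrative assumptions.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds


def toy_tissue(rng, n_regions, n_cells, nnz):
    """Hypothetical stand-in for one tissue: a sparse binary (regions x cells) matrix."""
    rows = rng.integers(0, n_regions, size=nnz)
    cols = rng.integers(0, n_cells, size=nnz)
    m = sparse.csr_matrix((np.ones(nnz), (rows, cols)), shape=(n_regions, n_cells))
    m.data[:] = 1.0  # duplicate coordinates were summed; reset entries to binary
    return m


def embed_cells(tissues, k):
    """Truncated SVD of the mode-1 unfolding (regions x all cells).

    Returns a (total_cells x k) low-dimensional cell embedding and the
    k leading singular values in descending order.
    """
    unfolded = sparse.hstack(tissues, format="csr")  # never densified
    u, s, vt = svds(unfolded, k=k)
    order = np.argsort(s)[::-1]  # svds does not guarantee descending order
    s, vt = s[order], vt[order, :]
    return vt.T * s, s


rng = np.random.default_rng(0)
# Assumed toy sizes: 3 tissues, 2000 genomic intervals, 50 cells per tissue.
tissues = [toy_tissue(rng, n_regions=2000, n_cells=50, nnz=2000) for _ in range(3)]
cell_embedding, singular_values = embed_cells(tissues, k=5)
print(cell_embedding.shape)
```

In a sketch like this, `cell_embedding` is the kind of low-dimensional input that could then be passed to a UMAP implementation for visualization; the sparse format is what keeps the computation feasible when the number of intervals grows into the millions.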

Keywords: Large sparse matrices; Single-cell applications; Tensor decomposition; UMAP; scATAC-seq.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin*
  • Gene Expression Regulation
  • Genome*
  • Transcription Factors / metabolism

Substances

  • Chromatin
  • Transcription Factors