Representing and extracting knowledge from single-cell data

Biophys Rev. 2023 Aug 5;16(1):29-56. doi: 10.1007/s12551-023-01091-4. eCollection 2024 Feb.

Abstract

Single-cell analysis is currently one of the highest-resolution techniques for studying biology. The large, complex datasets it has generated have spurred numerous developments in computational biology, in particular the use of advanced statistics and machine learning. This review attempts to explain the deeper theoretical concepts that underpin current state-of-the-art analysis methods. Single-cell analysis is covered from the cell, through instruments, to current and upcoming models. The aim of this review is to spread concepts that are not yet in common use, especially from topology and generative processes, and to show how new statistical models can be developed to capture more of biology. This raises epistemological questions regarding our ontology and models, and some pointers are given on how natural language processing (NLP) may help overcome our cognitive limitations in understanding single-cell data.

Keywords: Generating processes; Graphs; Markov chains; NLP; Neural networks; Single-cell; Statistics; Topology; VAE.

Publication types

  • Review