JSOM: Jointly-evolving self-organizing maps for alignment of biological datasets and identification of related clusters

PLoS Comput Biol. 2021 Mar 16;17(3):e1008804. doi: 10.1371/journal.pcbi.1008804. eCollection 2021 Mar.

Abstract

With the rapid advance of single-cell technologies, an increasing number of single-cell datasets are being generated, and computational tools for aligning these datasets, which make subsequent integration or meta-analysis possible, have become critical. Typically, single-cell datasets from different technologies cannot be directly combined or concatenated because of innate differences in the data, such as the number of measured parameters and their distributions. Even datasets generated by the same technology are often affected by batch effects. A computational approach for aligning different datasets, and hence identifying related clusters, will be useful for data integration and interpretation in large-scale single-cell experiments. Our proposed algorithm, JSOM, a variation of the self-organizing map, aligns two related datasets that contain similar clusters by constructing two maps (low-dimensional discretized representations of the datasets) that jointly evolve according to both datasets. Here we applied the JSOM algorithm to flow cytometry, mass cytometry, and single-cell RNA sequencing datasets. The resulting JSOM maps not only align the related clusters in the two datasets but also preserve the topology of the datasets, so that the maps can be used for further analysis, such as clustering.
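The abstract does not give the JSOM update rule, so the following is only a toy illustration of what "two maps that jointly evolve" could look like, not the published algorithm. It trains two ordinary self-organizing maps on a shared grid and adds a hypothetical coupling step that pulls each node toward the node at the same grid position in the other map. The function name `train_coupled_soms`, the `coupling` parameter, and the requirement that both datasets share the same number of features are all assumptions made for this sketch.

```python
import numpy as np

def train_coupled_soms(data_a, data_b, grid=(5, 5), n_iter=2000,
                       lr=0.5, sigma=1.5, coupling=0.1, seed=0):
    """Toy pair of coupled SOMs (hypothetical coupling, NOT the published JSOM rule).

    Assumption: data_a and data_b have the same number of features.
    Returns two arrays of shape (rows * cols, n_features).
    """
    if data_a.shape[1] != data_b.shape[1]:
        raise ValueError("this sketch assumes equal feature counts")
    rng = np.random.default_rng(seed)
    rows, cols = grid
    n_nodes = rows * cols
    # Grid coordinates of every node, shape (n_nodes, 2), shared by both maps.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    # Initialise each map from random samples of its own dataset.
    map_a = data_a[rng.integers(0, len(data_a), n_nodes)].astype(float)
    map_b = data_b[rng.integers(0, len(data_b), n_nodes)].astype(float)

    for t in range(n_iter):
        frac = 1.0 - t / n_iter            # linear decay schedule
        cur_lr = lr * frac
        cur_sigma = max(sigma * frac, 0.3)
        for own, other, data in ((map_a, map_b, data_a),
                                 (map_b, map_a, data_b)):
            x = data[rng.integers(0, len(data))]
            # Standard SOM step: find the best-matching unit (BMU) ...
            bmu = np.argmin(((own - x) ** 2).sum(axis=1))
            # ... and move its grid neighbourhood toward the sample.
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * cur_sigma ** 2))[:, None]
            own += cur_lr * h * (x - own)
            # Hypothetical coupling step: pull the same neighbourhood toward
            # the corresponding nodes of the other map.
            own += coupling * cur_lr * h * (other - own)
    return map_a, map_b
```

In this sketch, node `i` of one map and node `i` of the other occupy the same grid position, so clusters that land on the same grid region in both maps are candidates for being related. Handling datasets with different numbers of measured parameters, which the abstract mentions as a motivating case, would need a different coupling than the one assumed here.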

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Animals
  • Cluster Analysis
  • Computational Biology / methods*
  • Databases, Factual
  • Flow Cytometry
  • Humans
  • Mice
  • Sequence Analysis, RNA
  • Unsupervised Machine Learning*

Grants and funding

This work has been supported by The Leona M. and Harry B. Helmsley Charitable Trust (G-2007-04028 to P.Q.) and the National Science Foundation (CCF1552784 and CCF2007029 to P.Q.). P.Q. is an ISAC Marylou Ingram Scholar and a Carol Ann and David D. Flanagan Faculty Fellow. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.