BRAQUE: Bayesian Reduction for Amplified Quantization in UMAP Embedding

Entropy (Basel). 2023 Feb 14;25(2):354. doi: 10.3390/e25020354.

Abstract

Single-cell biology has revolutionized the way we understand biological processes. In this paper, we provide a more tailored approach to clustering and analyzing spatial single-cell data coming from immunofluorescence imaging techniques. We propose Bayesian Reduction for Amplified Quantization in UMAP Embedding (BRAQUE) as an integrative novel approach, from data preprocessing to phenotype classification. BRAQUE starts with an innovative preprocessing, named Lognormal Shrinkage, which is able to enhance input fragmentation by fitting a lognormal mixture model and shrink each component towards its median, in order to help further the clustering step in finding more separated and clear clusters. Then, BRAQUE's pipeline consists of a dimensionality reduction step performed using UMAP, and a clustering performed using HDBSCAN on UMAP embedding. In the end, clusters are assigned to a cell type by experts, using effects size measures to rank markers and identify characterizing markers (Tier 1), and possibly characterize markers (Tier 2). The number of total cell types in one lymph node detectable with these technologies is unknown and difficult to predict or estimate. Therefore, with BRAQUE, we achieved a higher granularity than other similar algorithms such as PhenoGraph, following the idea that merging similar clusters is easier than splitting unclear ones into clear subclusters.

Keywords: Bayesian; Gaussian mixture; cell type; clustering; dimensionality reduction; effect size; lognormal; lymphoid tissue; multiplex immunostaining; single-cell.