Cluster analysis for localisation-based data sets: dos and don'ts when quantifying protein aggregates

Front Bioinform. 2023 Nov 24;3:1237551. doi: 10.3389/fbinf.2023.1237551. eCollection 2023.

Abstract

Many proteins display a non-random distribution on the cell surface. From dimers to nanoscale clusters to large, micron-scale aggregations, these distributions regulate protein-protein interactions and signalling. Although these distributions show organisation on length scales below the resolution limit of conventional optical microscopy, single molecule localisation microscopy (SMLM) can map molecule locations with nanometre precision. The data from SMLM are not a conventional pixelated image; instead they take the form of a point pattern: a list of the x, y coordinates of the localised molecules. To extract the biological insights that researchers require, cluster analysis is often performed on these data sets, quantifying parameters such as the size of clusters and the percentage of monomers. Here, we provide some guidance on how SMLM clustering should best be performed.
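To make the idea of a point pattern and its cluster statistics concrete, the following is a minimal sketch in Python. It is not the method recommended in the review: it uses a simple fixed-radius linkage (connected components of points closer than a chosen radius) as a stand-in for a proper clustering algorithm. The function names, the linking radius, the minimum cluster size and the simulated data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def simulate_points(n_clusters=5, per_cluster=20, n_monomers=30,
                    sigma=10.0, fov=2000.0, seed=0):
    """Hypothetical test data: Gaussian clusters plus uniformly scattered monomers (nm)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_clusters):
        cx, cy = rng.uniform(0, fov), rng.uniform(0, fov)
        for _ in range(per_cluster):
            points.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    for _ in range(n_monomers):
        points.append((rng.uniform(0, fov), rng.uniform(0, fov)))
    return points


def link_clusters(points, radius):
    """Label each point by the connected component it belongs to,
    where two points are linked if they lie within `radius` of each other."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= radius:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri
    return [find(i) for i in range(n)]


def summarise(labels, min_size=3):
    """Report cluster sizes (groups of at least `min_size` points)
    and the percentage of points that are isolated (monomers)."""
    group_sizes = Counter(labels).values()
    cluster_sizes = sorted((s for s in group_sizes if s >= min_size), reverse=True)
    n_monomers = sum(1 for s in group_sizes if s == 1)
    return {
        "n_clusters": len(cluster_sizes),
        "cluster_sizes": cluster_sizes,
        "percent_monomers": 100.0 * n_monomers / len(labels),
    }


if __name__ == "__main__":
    pts = simulate_points()
    print(summarise(link_clusters(pts, radius=30.0)))
```

The results of a sketch like this depend strongly on the assumed radius and minimum cluster size, which is one reason the choice of clustering parameters deserves care.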

Keywords: bioinformatics; cluster analysis; image quantification; protein aggregates; single molecule localisation microscopy (SMLM); spatial point pattern (SPP).

Publication types

  • Review

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. DDLS fellowship (K&A Wallenberg foundation) granted to JG. EPSRC Centre for Doctoral Training in Topological Design granted to LP.