Cell population identification using fluorescence-minus-one controls with a one-class classifying algorithm

Bioinformatics. 2014 Dec 1;30(23):3372-8. doi: 10.1093/bioinformatics/btu575. Epub 2014 Aug 27.

Abstract

Motivation: The tried and true approach of flow cytometry data analysis is to manually gate on each biomarker separately, which is feasible for a small number of biomarkers, e.g. fewer than five. However, this rapidly becomes confusing as the number of biomarkers increases. Furthermore, multivariate structure is not taken into account. Recently, automated gating algorithms have been implemented, all of which rely on unsupervised learning methodology. However, all unsupervised learning outputs suffer the same difficulties in validation in the absence of external knowledge, regardless of application domain.

Results: We present a new semi-automated algorithm for population discovery that is based on comparison to fluorescence-minus-one controls, thus transforming the problem into one of one-class classification, as opposed to an unsupervised learning problem. The novel one-class classification algorithm is based on common principal components and can accommodate complex mixtures of multivariate densities. Computational time is short, and the simple nature of the calculations means the algorithm can easily be adapted to process large numbers of cells (10^6). Furthermore, we are able to find rare cell populations as well as populations with low biomarker concentration, both of which are inherently hard to do in an unsupervised learning context without prior knowledge of the samples' composition.
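To illustrate the general idea of one-class classification against a control, the following Python sketch fits a model to control events only and flags cells in a stained sample that fall outside it. It is not the authors' algorithm: in place of common principal components and mixtures of densities it uses a single Gaussian described by ordinary principal components and a squared Mahalanobis distance. The function names, the 0.999 quantile cutoff and the simulated three-channel data are all assumptions made for illustration.

```python
import numpy as np


def fit_control_model(control, quantile=0.999):
    """Fit a one-class model to control events (rows = cells, columns = channels).

    Assumption: the control is summarised by one Gaussian; the paper's method
    handles complex mixtures via common principal components instead.
    """
    mean = control.mean(axis=0)
    centered = control - mean
    cov = np.cov(centered, rowvar=False)
    # Principal axes and variances of the control distribution.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, 1e-12)  # guard against zero variance
    # Squared Mahalanobis distance of every control cell in PC space.
    dist = ((centered @ eigvecs) ** 2 / eigvals).sum(axis=1)
    # Cutoff chosen so that about (1 - quantile) of control cells exceed it.
    threshold = np.quantile(dist, quantile)
    return mean, eigvecs, eigvals, threshold


def flag_outside_control(sample, model):
    """Return a boolean mask of sample cells that lie outside the control model."""
    mean, eigvecs, eigvals, threshold = model
    dist = (((sample - mean) @ eigvecs) ** 2 / eigvals).sum(axis=1)
    return dist > threshold


# Simulated data (hypothetical): 3 channels, 1% rare positive cells.
rng = np.random.default_rng(0)
control = rng.normal(0.0, 1.0, size=(20000, 3))
negative = rng.normal(0.0, 1.0, size=(9900, 3))
rare_positive = rng.normal([6.0, 0.0, 0.0], 1.0, size=(100, 3))
sample = np.vstack([negative, rare_positive])

model = fit_control_model(control)
flags = flag_outside_control(sample, model)
print("flagged cells:", int(flags.sum()))
```

Because the model is learned from the control alone, the size of the positive population does not influence the fit, which is why a rare population can still be separated in this toy setting.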

Availability and implementation: R scripts are available via https://fccf.mpiib-berlin.mpg.de/daten/drfz/bioinformatics/ with {username,password}={bioinformatics,Sar=Gac4}.

MeSH terms

  • Algorithms*
  • Biomarkers / analysis
  • Cluster Analysis
  • Flow Cytometry / methods*
  • Fluorescence
  • Humans
  • Support Vector Machine

Substances

  • Biomarkers