Classifying flow cytometry data using Bayesian analysis helps to distinguish ALS patients from healthy controls

Front Immunol. 2023 Aug 1;14:1198860. doi: 10.3389/fimmu.2023.1198860. eCollection 2023.

Abstract

Introduction: Given its wide availability and cost-effectiveness, multidimensional flow cytometry (mFC) has become a core method in immunology. It allows the analysis of a broad range of individual cells and provides insights into cell subset composition, cellular behavior, and cell-to-cell interactions. Formerly, the analysis of mFC data relied solely on manual gating strategies. With the advent of novel computational approaches, (semi-)automated gating strategies and analysis tools have complemented manual approaches.

Methods: Using Bayesian network analysis, we developed a mathematical model for the dependencies between the measured mFC markers. From raw, ungated mFC data of a randomly selected healthy control (HC) cohort, the algorithm constructs a Bayesian network, referred to as the HC tree. The HC tree is then used to assess whether it predicts the marker distribution observed in a given sample (from patients with amyotrophic lateral sclerosis (ALS) or from HC). For each sample, the relative number of cells for which the probability q equals zero is calculated; this value reflects the similarity in marker distribution between a randomly chosen mFC file (ALS or HC) and the HC tree.
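
The zero-probability score described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the tree structure is learned or how bins are placed, so the tree is passed in as a fixed, hypothetical `parents` list, bins are equal-width, and the function names (`bin_cells`, `fit_tree_counts`, `zero_probability_fraction`) and the synthetic data are assumptions made for illustration only. The sketch relies on the observation that a product of tree-structured conditional probabilities is zero exactly when at least one (parent bin, child bin) combination was never observed in the reference cells.

```python
import numpy as np

def bin_cells(values, edges):
    """Map raw marker intensities (cells x markers) to integer bin indices."""
    # edges[j] holds the interior bin edges for marker j (assumed equal-width here)
    return np.column_stack([
        np.digitize(values[:, j], edges[j]) for j in range(values.shape[1])
    ])

def fit_tree_counts(binned, parents, n_bins):
    """Count the bin combinations seen in reference cells for each tree edge."""
    tables = {}
    for child, parent in enumerate(parents):
        if parent is None:
            # Root marker: marginal counts per bin
            tables[child] = np.bincount(binned[:, child], minlength=n_bins)
        else:
            # Non-root marker: joint counts over (parent bin, child bin)
            table = np.zeros((n_bins, n_bins), dtype=np.int64)
            np.add.at(table, (binned[:, parent], binned[:, child]), 1)
            tables[child] = table
    return tables

def zero_probability_fraction(binned, parents, tables):
    """Fraction of cells whose tree probability q is exactly zero."""
    zero = np.zeros(binned.shape[0], dtype=bool)
    for child, parent in enumerate(parents):
        if parent is None:
            zero |= tables[child][binned[:, child]] == 0
        else:
            zero |= tables[child][binned[:, parent], binned[:, child]] == 0
    return float(zero.mean())

# Synthetic demonstration data (not study data)
rng = np.random.default_rng(0)
n_bins = 4
parents = [None, 0, 1]  # hypothetical chain: marker 0 -> marker 1 -> marker 2
edges = [np.linspace(0.0, 1.0, n_bins + 1)[1:-1]] * 3

reference = rng.uniform(0.0, 0.5, size=(5000, 3))  # stands in for pooled HC cells
tables = fit_tree_counts(bin_cells(reference, edges), parents, n_bins)

similar = rng.uniform(0.0, 0.5, size=(1000, 3))  # same distribution as reference
shifted = rng.uniform(0.5, 1.0, size=(1000, 3))  # occupies bins unseen in reference
frac_similar = zero_probability_fraction(bin_cells(similar, edges), parents, tables)
frac_shifted = zero_probability_fraction(bin_cells(shifted, edges), parents, tables)
```

In this toy setting a file drawn from the reference distribution yields a low fraction, while a file whose cells fall into bin combinations absent from the reference yields a high one. In the study, the number of markers, bins, and reference subjects were tuned parameters.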

Results: Including peripheral blood mFC data from 68 ALS and 35 HC, the algorithm correctly identified 64/68 ALS cases. Tuning of parameters revealed that the combination of 7 markers, 200 bins, and 20 patients achieved the highest AUC at a significance level of p < 0.0001. The markers CD4 and CD38 showed the highest zero probability. We successfully validated our approach by including a second, independent ALS and HC cohort (55 ALS and 30 HC). In this cohort, all ALS cases were correctly identified, and side scatter and CD20 yielded the highest zero probability. Finally, both datasets were analyzed with the commercially available algorithm 'Citrus'; the comparison indicated a superior ability of Bayesian network analysis when including raw, ungated mFC data.
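
As an illustration of how an AUC can be derived from a per-file score such as the zero-probability fraction, the following sketch computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name `auc_from_scores` is hypothetical, the scores are made-up numbers rather than study data, and the sketch assumes that a higher score means a file is less similar to the HC tree. The abstract does not specify how the reported AUC or p-value was computed.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as P(case score > control score), counting ties as 0.5."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Made-up zero-probability fractions for illustration only (not study data)
als_like = [0.30, 0.22, 0.41, 0.08]
hc_like = [0.05, 0.10, 0.02]
auc = auc_from_scores(als_like, hc_like)  # 11 of 12 case/control pairs are ordered correctly
```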

Discussion: Bayesian network analysis might present a novel approach for classifying mFC data that does not rely on reduction techniques, thereby retaining information on the entire dataset. Future studies will have to assess its performance in discriminating clinically relevant differential diagnoses to evaluate the complementary diagnostic benefit of Bayesian network analysis in the routine clinical workup.

Keywords: ALS; Bayesian analysis; flow cytometry; immune system; mathematical modeling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Algorithms
  • Amyotrophic Lateral Sclerosis* / diagnosis
  • Bayes Theorem
  • Female
  • Flow Cytometry* / classification
  • Flow Cytometry* / methods
  • Humans
  • Male
  • Middle Aged
  • Models, Theoretical

Grants and funding

MH thanks the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) for the financial support through 320021702/GRK2326, 333849990/IRTG-2379, B04, B05 and B06 of 442047500/SFB1481, HE5386/18-1,19-2,22-1,23-1,25-1, ERS SFDdM035 and under Germany’s Excellence Strategy EXC-2023 Internet of Production 390621612 and under the Excellence Strategy of the Federal Government and the Länder. Support through the EU DATAHYKING is also acknowledged. SS and ID received funding from the Deutsche Gesellschaft für Muskelkranke (DGM Sc26/1). JI, SM, and MH received funding from the Bundesinstitut für Risikobewertung (60-0102-01.P620).