Machine learning-based cluster analysis of immune cell subtypes and breast cancer survival

Sci Rep. 2023 Nov 3;13(1):18962. doi: 10.1038/s41598-023-45932-4.

Abstract

Host immunity involves various immune cells working in concert to achieve a balanced immune response. Host immunity interacts with the tumorigenic process, influencing disease outcome. Clusters of different immune cells may reveal unique host immunity in relation to breast cancer progression. The CIBERSORT algorithm was used to estimate the relative abundances of 22 immune cell types in 3 datasets: METABRIC, TCGA, and our own study. The cell type data in METABRIC were analyzed for clusters using unsupervised hierarchical clustering (UHC). The UHC results were used to train machine learning models. Kaplan-Meier and Cox regression survival analyses were performed to assess the association of cell clusters with relapse-free and overall survival. Genes differentially expressed between clusters were interrogated with Ingenuity Pathway Analysis (IPA) for molecular signatures. UHC analysis identified two distinct immune cell clusters, cluster A (83.2%) and cluster B (16.8%). Memory B cells, plasma cells, CD8-positive T cells, resting memory CD4 T cells, activated NK cells, monocytes, M1 macrophages, and resting mast cells were more abundant in cluster A than in cluster B, whereas regulatory T cells and M0 and M2 macrophages were more abundant in cluster B than in cluster A. Patients in cluster A had favorable survival. Similar survival associations were also observed in other independent studies. IPA showed that the pathogen-induced cytokine storm signaling pathway, phagosome formation, and T cell receptor signaling were related to the cell type clusters. Our findings suggest that different immune cell clusters may indicate distinct immune responses to tumor growth and may therefore have potential for disease management.
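
The workflow described in the abstract (cluster cell-type fractions, reuse the cluster labels to train a classifier, then compare survival by cluster) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the toy data are synthetic and use 3 cell types rather than 22, the clustering uses Euclidean distance with average linkage, a nearest-centroid rule stands in for the unspecified machine learning models, the larger cluster is arbitrarily named A, and the helper names (two_clusters, fit_centroids, predict, kaplan_meier) are hypothetical. Cox regression and the CIBERSORT step are omitted.

```python
from math import dist

# Synthetic toy data, NOT from the study: each row is one tumor's relative
# abundance of (CD8 T cells, regulatory T cells, M2 macrophages).
discovery = [
    (0.60, 0.10, 0.30), (0.55, 0.15, 0.30), (0.65, 0.05, 0.30),
    (0.58, 0.12, 0.30), (0.20, 0.30, 0.50), (0.15, 0.35, 0.50),
]
months = [60, 72, 48, 80, 12, 30]  # follow-up time per patient (synthetic)
relapsed = [0, 0, 1, 0, 1, 1]      # 1 = relapse observed, 0 = censored


def two_clusters(rows):
    """Agglomerative clustering (average linkage), stopped at two clusters."""
    clusters = [[i] for i in range(len(rows))]

    def avg_dist(a, b):
        total = sum(dist(rows[i], rows[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 2:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        # Merge the two clusters with the smallest average pairwise distance.
        p, q = min(pairs, key=lambda pq: avg_dist(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [None] * len(rows)
    # Assumption: call the larger cluster "A" and the smaller one "B".
    for name, members in zip("AB", sorted(clusters, key=len, reverse=True)):
        for i in members:
            labels[i] = name
    return labels


def fit_centroids(rows, labels):
    """'Train' a nearest-centroid classifier on the cluster labels."""
    centroids = {}
    for name in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == name]
        centroids[name] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Assign a new sample to the cluster with the closest centroid."""
    return min(centroids, key=lambda name: dist(centroids[name], row))


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each time where at least one event occurred."""
    at_risk, surv, curve = len(times), 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Everyone with time t (event or censored) leaves the risk set.
        at_risk -= sum(1 for ti in times if ti == t)
    return curve


labels = two_clusters(discovery)
centroids = fit_centroids(discovery, labels)
new_label = predict(centroids, (0.18, 0.32, 0.50))  # unseen synthetic sample

curves = {}
for name in "AB":
    t = [m for m, lab in zip(months, labels) if lab == name]
    e = [r for r, lab in zip(relapsed, labels) if lab == name]
    curves[name] = kaplan_meier(t, e)
    print(name, curves[name])
```

In practice, an analysis of this kind would more likely rely on established libraries for hierarchical clustering, classification, and survival modeling; the hand-written versions above only make the individual steps explicit.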

MeSH terms

  • Breast Neoplasms* / genetics
  • Cluster Analysis
  • Female
  • Humans
  • Machine Learning
  • Neoplasm Recurrence, Local
  • Survival Analysis