Machine learning-based prediction model for distant metastasis of breast cancer

Comput Biol Med. 2024 Feb:169:107943. doi: 10.1016/j.compbiomed.2024.107943. Epub 2024 Jan 6.

Abstract

Background: Breast cancer is the most prevalent malignancy in women. Advanced breast cancer can develop distant metastases, which pose a severe threat to patients' lives. Because the clinical warning signs of distant metastasis appear late in the disease, better methods of predicting metastasis are needed.

Methods: First, we screened the selected datasets for breast cancer distant metastasis target genes using differential expression analysis and weighted gene co-expression network analysis (WGCNA), and characterized these target genes with GO enrichment and related analyses. Second, we screened for breast cancer distant metastasis biomarkers using LASSO regression analysis, and performed correlation and other analyses on these biomarkers. Finally, we constructed several breast cancer distant metastasis prediction models based on Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT) and eXtreme Gradient Boosting (XGBoost), and selected the best-performing model.
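The LASSO screening step can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a plain least-squares LASSO solved by cyclic coordinate descent (the abstract does not say which LASSO variant or software was used), it runs on synthetic data, and every name in it (`make_synthetic_cohort`, `select_genes`, the penalty `alpha=0.15`) is hypothetical. It only shows how an L1 penalty shrinks the coefficients of uninformative genes to exactly zero, leaving a short list of candidate biomarkers.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardise_columns(X):
    """Return a copy of X with each column shifted to mean 0 and scaled to variance 1."""
    n, p = len(X), len(X[0])
    out = [row[:] for row in X]
    for j in range(p):
        mean = sum(row[j] for row in X) / n
        var = sum((row[j] - mean) ** 2 for row in X) / n
        scale = var ** 0.5 or 1.0  # leave constant columns unscaled
        for i in range(n):
            out[i][j] = (X[i][j] - mean) / scale
    return out


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + w[j] * col_sq[j]
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


def make_synthetic_cohort(n_samples=300, n_genes=10, seed=0):
    """Hypothetical data: only gene 0 and gene 1 drive the binary label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [1.0 if row[0] + row[1] > 0 else 0.0 for row in X]
    return X, y


def select_genes(X, y, alpha):
    """Return indices of genes whose LASSO coefficient is non-zero."""
    Xs = standardise_columns(X)
    y_mean = sum(y) / len(y)
    y_centred = [v - y_mean for v in y]
    w = lasso_coordinate_descent(Xs, y_centred, alpha)
    return [j for j, wj in enumerate(w) if wj != 0.0]


X, y = make_synthetic_cohort()
print("selected gene indices:", select_genes(X, y, alpha=0.15))
```

In practice the penalty strength would be chosen by cross-validation rather than fixed by hand, and a binary outcome such as metastasis status is often modelled with an L1-penalised logistic loss instead of the squared loss assumed here.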

Results: Several 21-gene breast cancer distant metastasis prediction models were constructed, of which the model based on random forest performed best. This model accurately predicted the emergence of distant metastases from breast cancer, achieving an accuracy of 93.6%, an F1-score of 88.9% and an AUC of 91.3% on the validation set.
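For readers less familiar with the three reported metrics, the sketch below shows how accuracy, F1-score and AUC are conventionally computed from true labels and model outputs. The labels and scores are toy values invented for illustration, not data from the study, and the 0.5 decision threshold is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy validation set: 1 = distant metastasis, 0 = no metastasis (invented values).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"F1-score = {f1_score(y_true, y_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, scores):.3f}")
```

Accuracy and F1-score depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the three values can differ noticeably on an imbalanced validation set.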

Conclusion: Our findings have the potential to be translated into a point-of-care prognostic analysis to reduce breast cancer mortality.

Keywords: Biomarkers; Breast cancer distant metastasis; Machine learning; Predictive model; Weighted correlation network analysis.

MeSH terms

  • Breast
  • Breast Neoplasms*
  • Female
  • Gene Expression Profiling
  • Humans
  • Logistic Models
  • Machine Learning