SaBrcada: Survival Intervals Prediction for Breast Cancer Patients by Dimension Raising and Age Stratification

Cancers (Basel). 2023 Jul 20;15(14):3690. doi: 10.3390/cancers15143690.

Abstract

(1) Background: Breast cancer is the second leading cause of cancer death among women. The accurate prediction of survival intervals will help physicians make informed decisions about treatment strategies or the use of palliative care. (2) Methods: Gene expression is predictive and correlates to patient prognosis. To establish a reliable prediction tool, we collected a total of 1187 RNA-seq data points from breast cancer patients (median age 58 years) in Fragments Per Kilobase Million (FPKM) format from the TCGA database. Among them, we selected 144 patients with date of death information to establish the SaBrcada-AD dataset. We first normalized the SaBrcada-AD dataset to TPM to build the survival prediction model SaBrcada. After normalization and dimension raising, we used the differential gene expression data to test eight different deep learning architectures. Considering the effect of age on prognosis, we also performed a stratified random sampling test on all ages between the lower and upper quartiles of patient age, 48 and 69 years; (3) Results: Stratifying by age 61, the performance of SaBrcada built by GoogLeNet was improved to a highest accuracy of 0.798. We also built a free website tool to provide five predicted survival periods: within six months, six months to one year, one to three years, three to five years, or over five years, for clinician reference. (4) Conclusions: We built the prediction model, SaBrcada, and the website tool of the same name for breast cancer survival analysis. Through these models and tools, clinicians will be provided with survival interval information as a basis for formulating precision medicine.

Keywords: age stratification; breast cancer; data dimension raising; deep learning; survival analysis.