Empirical Study of Overfitting in Deep Learning for Predicting Breast Cancer Metastasis

Chuhan Xu; Pablo Coen-Pirani; Xia Jiang

doi:10.3390/cancers15071969

Cancers (Basel). 2023 Mar 25;15(7):1969. doi: 10.3390/cancers15071969.

Authors

Chuhan Xu¹, Pablo Coen-Pirani¹, Xia Jiang¹

Affiliation

¹ Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15217, USA.

Abstract

Overfitting weakens a model's ability to generalize and may therefore reduce the accuracy of its predictions on future data. In this research, we used an electronic health records (EHR) dataset concerning breast cancer metastasis to study overfitting in deep feedforward neural network (FNN) prediction models. We studied how each hyperparameter influences model performance and overfitting, and how some of the interesting pairs of hyperparameters interact to do so. The 11 hyperparameters we studied were activation function, weight initializer, number of hidden layers, learning rate, momentum, decay, dropout rate, batch size, epochs, L1, and L2. Our results show that most of the single hyperparameters are either negatively or positively correlated with model prediction performance and overfitting. In particular, we found that overfitting overall tends to negatively correlate with learning rate, decay, batch size, and L2, but tends to positively correlate with momentum, epochs, and L1. According to our results, learning rate, decay, and batch size may have a more significant impact on both overfitting and prediction performance than most of the other hyperparameters, including L1, L2, and dropout rate, which were designed for minimizing overfitting. We also found some interesting interacting pairs of hyperparameters, such as learning rate and momentum, learning rate and decay, and batch size and epochs.
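
The kind of analysis summarized above can be illustrated with a minimal sketch. It assumes, purely for illustration, that overfitting is quantified as the gap between training and test scores and that its association with a hyperparameter is measured by a Pearson correlation; the abstract does not state the exact measures used in the study. The function names overfitting_gap and pearson are hypothetical, and the numbers are invented placeholders, not results from the paper.

```python
from math import sqrt


def overfitting_gap(train_score, test_score):
    """Assumed overfitting measure: training score minus test score.

    A larger gap means the model fits the training data much better
    than unseen data.
    """
    return train_score - test_score


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented placeholder values (NOT data from the study): one model per
# learning rate, each with a training score and a test score.
learning_rates = [0.001, 0.01, 0.05, 0.1]
train_scores = [0.95, 0.90, 0.85, 0.80]
test_scores = [0.70, 0.72, 0.73, 0.74]

gaps = [overfitting_gap(tr, te) for tr, te in zip(train_scores, test_scores)]
r = pearson(learning_rates, gaps)
print(f"correlation between learning rate and overfitting gap: {r:.3f}")
```

With these placeholder values the gap shrinks as the learning rate grows, so the coefficient is negative. Repeating the calculation for each hyperparameter in turn would give one coefficient per hyperparameter, whose sign indicates the direction of the association.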

Keywords: L1; L2; activation function; batch size; breast cancer; deep learning; dropout rate; iteration-based decay; learning rate; metastasis; modeling; momentum beta; network structure; overfitting; training epochs; weight initializers.

Grants and funding

W81XWH1910495/United States Department of Defense