BUViTNet: Breast Ultrasound Detection via Vision Transformers

Diagnostics (Basel). 2022 Nov 1;12(11):2654. doi: 10.3390/diagnostics12112654.

Abstract

Convolutional neural networks (CNNs) have enhanced ultrasound image-based early breast cancer detection. Vision transformers (ViTs) have recently surpassed CNNs as the most effective method for natural image analysis. ViTs have proven their capability of incorporating more global information than CNNs at lower layers, and their skip connections are more powerful than those of CNNs, which endows ViTs with superior performance. However, the effectiveness of ViTs in breast ultrasound imaging has not yet been investigated. Here, we present BUViTNet breast ultrasound detection via ViTs, where ViT-based multistage transfer learning is performed using ImageNet and cancer cell image datasets prior to transfer learning for classifying breast ultrasound images. We utilized two publicly available ultrasound breast image datasets, Mendeley and breast ultrasound images (BUSI), to train and evaluate our algorithm. The proposed method achieved the highest area under the receiver operating characteristics curve (AUC) of 1 ± 0, Matthew’s correlation coefficient (MCC) of 1 ± 0, and kappa score of 1 ± 0 on the Mendeley dataset. Furthermore, BUViTNet achieved the highest AUC of 0.968 ± 0.02, MCC of 0.961 ± 0.01, and kappa score of 0.959 ± 0.02 on the BUSI dataset. BUViTNet outperformed ViT trained from scratch, ViT-based conventional transfer learning, and CNN-based transfer learning in classifying breast ultrasound images (p < 0.01 in all cases). Our findings indicate that improved transformers are effective in analyzing breast images and can provide an improved diagnosis if used in clinical settings. Future work will consider the use of a wide range of datasets and parameters for optimized performance.

Keywords: breast cancer; convolutional neural network; transfer learning; ultrasound; vision transformer.