Deep Learning Approaches for Detection of Breast Adenocarcinoma Causing Carcinogenic Mutations

Int J Mol Sci. 2022 Sep 29;23(19):11539. doi: 10.3390/ijms231911539.

Abstract

Genes are composed of DNA, and each gene has a specific sequence. Recombination or replication within a gene can result in a permanent change in the nucleotide sequence of the DNA, called a mutation, and some mutations can lead to cancer. Breast adenocarcinoma starts in secretory cells and is the most common of all cancers that occur in women. According to a survey in the United States of America, more than 282,000 breast adenocarcinoma patients are registered each year, most of them women. Recognition of cancer in its early stages saves many lives. A framework is proposed for the early detection of breast adenocarcinoma using an ensemble learning technique with multiple deep learning algorithms, specifically Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), and Bi-directional LSTM. There are 99 driver genes involved in breast adenocarcinoma. This study uses a dataset of 4127 samples, including men and women, taken from more than 12 cohorts of cancer detection institutes. The dataset encompasses a total of 6170 mutations that occur in the 99 genes. Different algorithms are applied to these gene sequences for feature extraction. Three testing techniques, namely independent set testing, self-consistency testing, and 10-fold cross-validation, are applied to validate and test the learning approaches. Subsequently, multiple deep learning approaches, namely LSTM, GRU, and bi-directional LSTM, are applied. The results are evaluated using several metrics: accuracy, sensitivity, specificity, Matthews correlation coefficient, area under the curve, training loss, precision, recall, F1 score, and Cohen's kappa, for which the values obtained are 99.57, 99.50, 99.63, 0.99, 1.0, 0.2027, 99.57, 99.57, 99.57, and 99.14, respectively.
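Several of the metrics listed above can be derived from a binary confusion matrix. The following is a minimal sketch of how they are commonly computed, not the authors' implementation. The function name `binary_metrics`, the helper `_ratio`, and the example counts are hypothetical and are not taken from the study's data. Returning 0.0 for an undefined ratio is an assumed convention.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 if the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = _ratio(tp + tn, total)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)

    # Matthews correlation coefficient
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_denominator)

    # Cohen's kappa: observed agreement corrected for chance agreement
    expected = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), total ** 2)
    kappa = _ratio(accuracy - expected, 1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not from the study).
metrics = binary_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The function returns fractions in the range 0 to 1 (or -1 to 1 for the Matthews correlation coefficient and Cohen's kappa). Multiplying by 100 gives percentages, which is how most of the values above appear to be reported.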

Keywords: bi-directional LSTM; breast adenocarcinoma; gated recurrent units (GRU); long short-term memory (LSTM) network; mutation detection.

MeSH terms

  • Adenocarcinoma* / diagnosis
  • Adenocarcinoma* / genetics
  • Breast Neoplasms* / diagnosis
  • Breast Neoplasms* / genetics
  • Carcinogens
  • Deep Learning*
  • Female
  • Humans
  • Male
  • Mutation
  • Nucleotides

Substances

  • Carcinogens
  • Nucleotides

Grants and funding

This research received no external funding.