AFExNet: An Adversarial Autoencoder for Differentiating Breast Cancer Sub-Types and Extracting Biologically Relevant Genes

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2060-2070. doi: 10.1109/TCBB.2021.3066086. Epub 2022 Aug 8.

Abstract

Technological advancements in high-throughput genomics enable the generation of large, complex data sets that can be used for classification, clustering, and bio-marker identification. Modern deep learning algorithms offer the opportunity to find the most significant features in such large data sets in order to characterize diseases (e.g., cancer) and their sub-types. Developing a deep learning method that can successfully extract meaningful features from various breast cancer sub-types is therefore of current research interest. In this paper, we develop a dual-stage (unsupervised pre-training and supervised fine-tuning) neural network architecture, termed AFExNet, based on an adversarial auto-encoder (AAE) to extract features from high-dimensional genetic data. To verify the usefulness of the new features, we evaluated the performance of our model with twelve different supervised classifiers on a public RNA-Seq data set of breast cancer. AFExNet provides consistent results in all performance metrics across the twelve classifiers, which makes our model classifier-independent. We also develop a method named 'TopGene' to find highly weighted genes from the latent space, which could be useful for finding cancer bio-markers. Taken together, AFExNet has great potential to extract features from biological data accurately and effectively. Our work is fully reproducible, and the source code can be downloaded from GitHub: https://github.com/NeuroSyd/breast-cancer-sub-types.
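
The abstract does not spell out how 'TopGene' scores genes. As a minimal sketch, assume that each gene is connected to the latent units by a row of encoder weights and that a gene's importance is the sum of the absolute values of those weights. Under that assumption, the ranking step might look like the following. The function name `top_genes`, the scoring rule, and the toy gene names and weights are illustrative assumptions, not the authors' implementation; the linked repository is the authoritative source.

```python
from typing import Dict, List, Sequence, Tuple


def top_genes(
    weights: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Rank genes by the summed absolute weight linking them to latent units.

    `weights` maps a gene name to its row of encoder weights (one value per
    latent unit). This scoring rule is an assumption made for illustration.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    scores = {gene: sum(abs(w) for w in row) for gene, row in weights.items()}
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Toy weights (hypothetical): 4 placeholder genes, 3 latent units.
toy_weights = {
    "GENE_A": [0.5, -1.0, 0.25],
    "GENE_B": [0.0, 0.25, -0.25],
    "GENE_C": [-2.0, 0.5, 0.5],
    "GENE_D": [0.25, 0.25, 0.25],
}
ranking = top_genes(toy_weights, k=2)
print(ranking)  # [('GENE_C', 3.0), ('GENE_A', 1.75)]
```

Summing absolute weights is only one plausible scoring choice; it treats strongly positive and strongly negative connections as equally informative.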

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Breast Neoplasms* / diagnosis
  • Breast Neoplasms* / genetics
  • Cluster Analysis
  • Female
  • Humans
  • Neural Networks, Computer
  • Software