mCNN-ETC: identifying electron transporters and their functional families by using multiple windows scanning techniques in convolutional neural networks with evolutionary information of protein sequences

Brief Bioinform. 2022 Jan 17;23(1):bbab352. doi: 10.1093/bib/bbab352.

Abstract

In the past decade, convolutional neural networks (CNNs) have been used as powerful tools for visual data tasks. However, many attempts to apply CNNs to protein function prediction and to extracting useful information from protein sequences have had certain limitations. In this research, we propose a new method that addresses the weaknesses of the previous method. mCNN-ETC is a deep learning model that transforms protein evolutionary information into image-like data composed of 20 channels, which correspond to the 20 amino acids in the protein sequence. We constructed CNN layers with different scanning windows in parallel to enhance the model's ability to detect useful patterns. We then filtered specific patterns through a 1-max pooling layer before passing them to the prediction layer. This research addresses a basic problem in biology: predicting electron transporters and classifying their corresponding complexes. The model reached an accuracy of 97.41%, which was nearly 6% higher than its predecessor. We have also published a web server at http://bio219.bioinfo.yzu.edu.tw, which can be used for research purposes free of charge.
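The multiple-window scanning idea described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The window sizes (2, 4, 8), the number of filters per window, the ReLU activation, the random weights and the function names `conv1d_valid` and `multi_window_features` are all assumptions made for illustration. The evolutionary profile is treated as a list of per-residue rows with 20 values each (one per amino acid channel). Each filter slides along the sequence, and 1-max pooling keeps only its strongest response.

```python
import random

NUM_CHANNELS = 20  # one channel per amino acid, as in the abstract


def conv1d_valid(pssm, kernel):
    """Slide one kernel (window x 20) along the sequence axis.

    Returns one ReLU-activated score per valid window position.
    """
    window = len(kernel)
    scores = []
    for start in range(len(pssm) - window + 1):
        total = 0.0
        for offset in range(window):
            row = pssm[start + offset]
            weights = kernel[offset]
            total += sum(r * w for r, w in zip(row, weights))
        scores.append(max(total, 0.0))  # ReLU (assumed activation)
    return scores


def multi_window_features(pssm, kernels_by_window):
    """Apply parallel filter banks of different window sizes.

    Each filter contributes exactly one value (its maximum response),
    which is the 1-max pooling step. If the sequence is shorter than a
    window, that filter contributes 0.0.
    """
    features = []
    for window in sorted(kernels_by_window):
        for kernel in kernels_by_window[window]:
            features.append(max(conv1d_valid(pssm, kernel), default=0.0))
    return features


# Hypothetical toy input: a random 30-residue profile with 20 channels.
rng = random.Random(0)
pssm = [[rng.uniform(-1, 1) for _ in range(NUM_CHANNELS)] for _ in range(30)]

# Hypothetical window sizes with 3 randomly initialised filters each.
kernels_by_window = {
    w: [[[rng.uniform(-1, 1) for _ in range(NUM_CHANNELS)] for _ in range(w)]
        for _ in range(3)]
    for w in (2, 4, 8)
}

features = multi_window_features(pssm, kernels_by_window)
print(len(features))  # fixed-length vector: one value per filter
```

Because 1-max pooling keeps a single value per filter, the feature vector has the same length for any protein length, so sequences of different sizes can feed the same prediction layer. In a trained model the filter weights would be learned rather than drawn at random.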

Keywords: convolutional neural network; deep learning; electron transport chain; five complexes; motif scanning; position-specific scoring matrix.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Biological Evolution
  • Electrons*
  • Humans
  • Neural Networks, Computer*
  • Proteins / chemistry

Substances

  • Proteins