Predicting FAD Interacting Residues with Feature Selection and Comprehensive Sequence Descriptors

IEEE/ACM Trans Comput Biol Bioinform. 2019 Nov-Dec;16(6):2046-2056. doi: 10.1109/TCBB.2018.2824332. Epub 2018 Apr 9.

Abstract

The function of a flavoprotein is determined to a great extent by the binding sites on its surface that interact with flavin adenine dinucleotide (FAD). Malfunction or dysregulation of FAD binding leads to a series of diseases. Accurately identifying FAD interacting residues (FIRs) therefore provides insight into the molecular mechanisms of flavoprotein-related biological processes and disease progression. In this paper, a new computational method is proposed for identifying FIRs from protein sequences. Various sequence-derived discriminative features are explored. We analyze how these features differ between FIRs and non-FIRs, and we investigate the predictive capabilities of both individual features and combinations of features. A relief algorithm followed by incremental feature selection (relief-IFS) is then adopted to search for the optimal features. Finally, a random forest (RF) module is used to predict FIRs based on the optimal features. In a 5-fold cross-validation test, the proposed method achieves a sensitivity of 0.847, a specificity of 0.933, an accuracy of 0.890, and a Matthews correlation coefficient (MCC) of 0.782, thereby outperforming previous methods. These results indicate that our method is relatively successful at predicting FIRs.
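For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how each is computed from a binary confusion matrix, using the standard definitions. The function name `binary_metrics` and the counts in the usage lines are hypothetical: the abstract does not give the confusion matrix, so the counts simply assume a balanced set of 1000 FIRs and 1000 non-FIRs chosen to match the reported sensitivity and specificity. They are not the paper's data.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC for a binary classifier.

    tp/fn count positives (here: FIRs) predicted correctly/incorrectly;
    tn/fp count negatives (here: non-FIRs) predicted correctly/incorrectly.
    """
    def safe_div(num, den):
        # Return 0.0 when a denominator is zero, a common convention.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)   # fraction of true FIRs recovered
    specificity = safe_div(tn, tn + fp)   # fraction of non-FIRs rejected
    accuracy = safe_div(tp + tn, tp + fn + tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts (assumed balanced set, not taken from the paper).
metrics = binary_metrics(tp=847, fn=153, tn=933, fp=67)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On a balanced set, accuracy is the mean of sensitivity and specificity, which is consistent with the reported 0.890 lying midway between 0.847 and 0.933. The MCC from these assumed counts comes out at about 0.78; any small difference from the reported value would be expected, since the true class sizes and per-fold counts are not stated in the abstract.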

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids / chemistry
  • Bayes Theorem
  • Binding Sites*
  • Computational Biology / methods*
  • Computer Simulation
  • Databases, Protein
  • Flavin Mononucleotide / chemistry
  • Flavin-Adenine Dinucleotide / chemistry*
  • Humans
  • Ligands
  • Protein Binding
  • Proteins / chemistry
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Solvents / chemistry

Substances

  • Amino Acids
  • Ligands
  • Proteins
  • Solvents
  • Flavin-Adenine Dinucleotide
  • Flavin Mononucleotide