Predicting Hot Spots Using a Deep Neural Network Approach

Methods Mol Biol. 2021;2190:267-288. doi: 10.1007/978-1-0716-0826-5_13.

Abstract

Targeting protein-protein interactions is a challenging and crucial task in the drug discovery process. A good starting point for rational drug design is the identification of hot spots (HS) at protein-protein interfaces: typically conserved residues that contribute most significantly to binding. In this chapter, we describe step by step an in-house pipeline for HS prediction that implements a deep neural network and uses only sequence-based features from the well-known SpotOn dataset of soluble proteins (Moreira et al., Sci Rep 7:8007, 2017). The pipeline is divided into three steps: (1) feature extraction, (2) deep learning classification, and (3) model evaluation. We present all the available resources, including code snippets, the main dataset, and the free and open-source modules/packages necessary for full replication of the protocol. Users should be able to develop an HS prediction model with accuracy, precision, recall, and AUROC of 0.96, 0.93, 0.91, and 0.86, respectively.
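
As a rough illustration of the model evaluation step, the four metrics named in the abstract can be computed from true labels and predicted probabilities. The sketch below is not the chapter's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and it uses only the Python standard library.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = hot spot."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and AUROC.

    The 0.5 threshold is an assumed default, not a value taken from the chapter.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "auroc": auroc(y_true, scores),
    }


# Toy data (hypothetical): 1 = hot spot, 0 = non-hot spot.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8]
metrics = evaluate(y_true, scores)
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUROC is computed from the raw scores and summarizes ranking quality across all thresholds.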

Keywords: Hot spots; Machine learning; Neural networks; Protein–protein interactions; Python; TensorFlow.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Deep Learning
  • Neural Networks, Computer
  • Protein Binding / physiology
  • Protein Interaction Mapping / methods*
  • Proteins / chemistry*

Substances

  • Proteins