GalaxyWater-CNN: Prediction of Water Positions on the Protein Structure by a 3D-Convolutional Neural Network

J Chem Inf Model. 2022 Jul 11;62(13):3157-3168. doi: 10.1021/acs.jcim.2c00306. Epub 2022 Jun 24.

Abstract

Proteins interact with numerous water molecules to perform their physiological functions in biological organisms. Most water molecules act as solvent media; hence, their roles may be considered implicitly in theoretical treatments of protein structure and function. However, some water molecules interact intimately with proteins and require explicit treatment to understand their effects. Most physics-based computational methods are limited in their ability to accurately locate water molecules on protein surfaces because of inaccurate energy functions. Instead of relying on an energy function, this study attempts to learn the locations of water molecules from structural data. GalaxyWater-convolutional neural network (CNN) predicts water positions on protein chains, protein-protein interfaces, and protein-compound binding sites using a 3D-CNN model that is trained to generate a water score map for a given protein structure. The training data are compiled from high-resolution protein crystal structures resolved together with water molecules. Compared with previous energy-based methods, GalaxyWater-CNN shows improved water prediction performance in both the coverage of crystal water molecules and the accuracy of the predicted water positions. The method performs particularly well at precisely predicting water molecules that form hydrogen-bond networks. The web service and the source code of this water prediction method are freely available at https://galaxy.seoklab.org/gwcnn and https://github.com/seoklab/GalaxyWater-CNN, respectively.
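
The abstract does not say how discrete water positions are obtained from the water score map. As an illustration only, the sketch below shows one common way to turn a 3D score grid into point predictions: greedy peak picking with a minimum separation distance. The function name, the threshold, the 2.0 Å separation, and the grid layout are all assumptions made for this example and are not taken from GalaxyWater-CNN.

```python
import numpy as np


def pick_water_sites(score_map, origin, spacing, threshold=0.5, min_dist=2.0):
    """Greedy peak picking on a 3D score grid (hypothetical helper).

    score_map : 3D array of per-voxel water scores (assumed layout: x, y, z)
    origin    : Cartesian coordinates of voxel (0, 0, 0), in angstroms
    spacing   : grid spacing in angstroms
    threshold : assumed minimum score for a voxel to be considered
    min_dist  : assumed minimum distance between two accepted sites
    """
    score_map = np.asarray(score_map, dtype=float)
    origin = np.asarray(origin, dtype=float)

    # Candidate voxels are those scoring above the threshold.
    idx = np.argwhere(score_map > threshold)
    if idx.size == 0:
        return np.empty((0, 3)), np.empty((0,))

    # Sort candidates from highest to lowest score.
    scores = score_map[tuple(idx.T)]
    order = np.argsort(-scores, kind="stable")
    coords = origin + idx[order] * spacing
    scores = scores[order]

    # Accept a candidate only if it is far enough from every accepted site.
    kept = []
    for i, xyz in enumerate(coords):
        if all(np.linalg.norm(xyz - coords[j]) >= min_dist for j in kept):
            kept.append(i)
    return coords[kept], scores[kept]


# Toy grid: two well-separated peaks, one neighbor of the first peak,
# and one voxel below the threshold.
grid = np.zeros((10, 10, 10))
grid[2, 2, 2] = 0.9
grid[2, 2, 3] = 0.8  # 1 angstrom from the first peak, so it is suppressed
grid[7, 7, 7] = 0.7
grid[5, 5, 5] = 0.3  # below threshold, so it is ignored
sites, site_scores = pick_water_sites(grid, origin=(0.0, 0.0, 0.0), spacing=1.0)
```

In this toy case the two accepted sites are the voxels at (2, 2, 2) and (7, 7, 7). A real pipeline would take the score grid from the trained network and would likely tune the threshold and separation distance against crystal water positions.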

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Neural Networks, Computer*
  • Protein Binding
  • Proteins / chemistry
  • Software
  • Water*

Substances

  • Proteins
  • Water