bSiteFinder, an improved protein-binding sites prediction server based on structural alignment: more accurate and less time-consuming

J Cheminform. 2016 Jul 11:8:38. doi: 10.1186/s13321-016-0149-z. eCollection 2016.

Abstract

Motivation: Protein-binding sites prediction lays a foundation for the functional annotation of proteins and for structure-based drug design. As the number of available protein structures increases, structural alignment-based algorithms have become the dominant approach to protein-binding sites prediction. However, present algorithms underutilize the ever-increasing number of three-dimensional protein-ligand complex structures (bound proteins), and they could be improved in the alignment process, the selection of templates and the clustering of templates. Herein, we built what is so far the largest database of bound templates, with stringent quality control. On this basis, we developed bSiteFinder, a protein-binding sites prediction server.

Results: By introducing Homology Indexing, Chain Length Indexing, Stability of Complex and Optimized Multiple-Templates Clustering into our algorithm, we significantly improved the efficiency of our server. Furthermore, its accuracy was approximately 2-10 % higher than that of other algorithms when tested on either bound or unbound datasets. On a bound dataset of 210 proteins, bSiteFinder achieved an accuracy of up to 94.8 % (MCC 0.95). On another bound/unbound dataset of 48 proteins, bSiteFinder achieved accuracies of up to 93.8 % for bound proteins (MCC 0.95) and 85.4 % for unbound proteins (MCC 0.72). The bSiteFinder server is freely available at http://binfo.shmtu.edu.cn/bsitefinder/, and the source code is provided on the methods page.
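The accuracy and MCC (Matthews correlation coefficient) figures above are standard summary statistics over a confusion matrix. The abstract does not state how predictions were counted as correct, so the following is only a minimal sketch of the textbook definitions. The function names and the example counts are hypothetical and are not taken from the bSiteFinder source code or results.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not from the paper).
    counts = dict(tp=45, tn=40, fp=5, fn=10)
    print(f"accuracy = {accuracy(**counts):.3f}")
    print(f"MCC      = {mcc(**counts):.3f}")
```

Because MCC uses all four cells of the confusion matrix, it can differ noticeably from accuracy when the classes are imbalanced, which is one reason both numbers are commonly reported together.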

Conclusion: The online bSiteFinder server is freely available at http://binfo.shmtu.edu.cn/bsitefinder/. Our work lays a foundation for the functional annotation of proteins and for structure-based drug design. As the number of three-dimensional protein-ligand complex structures continues to grow, our server should become more accurate and less time-consuming.

Graphical Abstract: bSiteFinder (http://binfo.shmtu.edu.cn/bsitefinder/) was developed as a protein-binding sites prediction server based on the largest database of bound templates so far, with stringent quality control. By introducing Homology Indexing, Chain Length Indexing, Stability of Complex and Optimized Multiple-Templates Clustering into our algorithm, the efficiency of our server has been significantly improved. Moreover, the accuracy was approximately 2-10 % higher than that of other algorithms when tested on either bound or unbound datasets.

Keywords: Index; Multiple-Templates Clustering; Protein-binding sites prediction; Structural alignment; Web server.