Assessment of contact predictions in CASP12: Co-evolution and deep learning coming of age

Joerg Schaarschmidt; Bohdan Monastyrskyy; Andriy Kryshtafovych; Alexandre M J J Bonvin

doi:10.1002/prot.25407

Proteins. 2018 Mar;86 Suppl 1:51-66. doi: 10.1002/prot.25407. Epub 2017 Nov 7.

Authors

Joerg Schaarschmidt¹, Bohdan Monastyrskyy², Andriy Kryshtafovych², Alexandre M J J Bonvin¹

Affiliations

¹ Faculty of Science - Chemistry, Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, The Netherlands.
² Genome Center, University of California, Davis, California.

Abstract

Following up on the encouraging results of residue-residue contact prediction in the CASP11 experiment, we present the analysis of predictions submitted for CASP12. The submissions comprise predictions from 34 groups for 38 domains classified as free modeling targets, which are not accessible to homology-based modeling because structural templates are lacking. CASP11 saw the rise of coevolution-based methods, which outperformed other approaches. The improvement of these methods, coupled with machine learning, and the growth of sequence databases are most likely the main drivers of a significant improvement in average precision from 27% in CASP11 to 47% in CASP12. For more than half of the targets, especially those with many accessible homologous sequences, precisions above 90% were achieved, with the best predictors reaching a precision of 100% in some cases. We furthermore tested the impact of using these contacts as restraints in ab initio modeling of 14 single-domain free modeling targets using Rosetta. Adding contacts to the Rosetta calculations resulted in improvements of up to 26% in GDT_TS within the top five structures.
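
The precision figures quoted in the abstract can be illustrated with a minimal sketch. It assumes that precision is the fraction of the highest-ranked predicted residue pairs that are true contacts in the experimental structure. The abstract does not state the contact definition, the sequence-separation range, or the list length used in the assessment, so the function name `contact_precision`, the `top_n` cutoff, and the toy data below are hypothetical and are not the assessors' evaluation code.

```python
from typing import List, Set, Tuple

# A predicted contact: (residue i, residue j, confidence score).
Prediction = Tuple[int, int, float]
# A native contact: (i, j) with i < j. How native contacts are derived from
# the structure is outside this sketch and is an assumption of the caller.
Contact = Tuple[int, int]


def contact_precision(predicted: List[Prediction],
                      native: Set[Contact],
                      top_n: int) -> float:
    """Fraction of the top_n highest-scoring predicted pairs that are native contacts."""
    # Rank predictions by confidence, highest first, and keep the top_n.
    ranked = sorted(predicted, key=lambda p: p[2], reverse=True)[:top_n]
    if not ranked:
        return 0.0
    # Normalise pair order so (j, i) matches a native contact stored as (i, j).
    hits = sum(1 for i, j, _ in ranked if (min(i, j), max(i, j)) in native)
    return hits / len(ranked)


# Toy data (hypothetical): four predictions, two of which are native contacts.
predicted = [(1, 30, 0.95), (40, 2, 0.90), (5, 50, 0.80), (3, 60, 0.40)]
native = {(1, 30), (5, 50)}

precision_top3 = contact_precision(predicted, native, top_n=3)
print(f"precision of top 3: {precision_top3:.2f}")
```

In this toy case two of the three top-ranked pairs are native contacts. An average precision such as the 27% and 47% values above would then correspond to averaging a per-target value of this kind over targets and groups, under whatever list length the assessment actually used.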

Keywords: CASP; co-variation; contact prediction; correlated mutations; de novo structure prediction; evolutionary coupling.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Computational Biology / methods*
Crystallography, X-Ray
Databases, Protein
Humans
Machine Learning
Models, Molecular*
Protein Conformation*
Protein Folding
Protein Interaction Domains and Motifs*
Proteins / chemistry*
Software

Substances

Proteins

Grants and funding

R01 GM100482/GM/NIGMS NIH HHS/United States