Small angle X-ray scattering and cross-linking for data assisted protein structure prediction in CASP 12 with prospects for improved accuracy

Proteins. 2018 Mar;86 Suppl 1(Suppl 1):202-214. doi: 10.1002/prot.25452. Epub 2018 Feb 7.

Abstract

Experimental data offers empowering constraints for structure prediction. These constraints can be used to filter equivalently scored models or, more powerfully, be incorporated into the optimization functions that drive prediction. In CASP12, Small Angle X-ray Scattering (SAXS) and Cross-Linking Mass Spectrometry (CLMS) data, measured on an exemplary set of novel fold targets, were provided to the CASP community of protein structure predictors. As solution-based techniques, SAXS and CLMS can efficiently measure states of the full-length sequence in its native solution conformation and assembly. However, this experimental data did not substantially improve prediction accuracy as judged by fits to crystallographic models. One issue, beyond intrinsic limitations of the algorithms, was a disconnect between crystal structures and solution-based measurements. Our analyses show that many targets had substantial percentages of disordered regions (up to 40%), were multimeric, or both. Thus, solution measurements of flexibility and assembly support variations that may confound prediction algorithms trained on crystallographic data and expecting globular, fully folded, monomeric proteins. Here, we consider the CLMS and SAXS data collected, the information in these solution measurements, and the challenges in incorporating them into computational prediction. As improvement opportunities were only partly realized in CASP12, we provide guidance on how data from the full-length biological unit and the solution state can better aid prediction of the folded monomer or subunit. We furthermore describe strategic integrations of solution measurements with computational prediction programs, with the aim of substantially improving foundational knowledge and the accuracy of computational algorithms for biologically relevant structure predictions for proteins in solution.
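
As a rough illustration of the filtering use mentioned in the abstract, candidate models with similar scores can be ranked by how many observed cross-links they satisfy. The sketch below is not the method used in CASP12. The abstract does not specify a cross-linker, a distance cutoff, or a scoring scheme, so the 30 Å Cα-Cα cutoff, the function names, and the toy coordinates are all assumptions made for illustration.

```python
import math

# Assumed cutoff: maximum C-alpha to C-alpha distance (in angstroms) treated as
# compatible with a cross-link. The real value depends on the cross-linker used.
MAX_CA_DISTANCE = 30.0


def satisfied_fraction(coords, crosslinks, cutoff=MAX_CA_DISTANCE):
    """Return the fraction of cross-linked residue pairs within the cutoff.

    coords maps residue number -> (x, y, z) of its C-alpha atom.
    crosslinks is a list of (residue_i, residue_j) pairs.
    """
    if not crosslinks:
        return 0.0
    hits = sum(
        1 for i, j in crosslinks
        if math.dist(coords[i], coords[j]) <= cutoff
    )
    return hits / len(crosslinks)


def rank_models(models, crosslinks, cutoff=MAX_CA_DISTANCE):
    """Sort (model name, score) pairs by decreasing cross-link satisfaction."""
    scores = {
        name: satisfied_fraction(coords, crosslinks, cutoff)
        for name, coords in models.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy coordinates for two hypothetical models of the same three residues.
models = {
    "compact": {1: (0.0, 0.0, 0.0), 10: (12.0, 0.0, 0.0), 25: (0.0, 20.0, 0.0)},
    "extended": {1: (0.0, 0.0, 0.0), 10: (12.0, 0.0, 0.0), 25: (0.0, 55.0, 0.0)},
}
crosslinks = [(1, 10), (1, 25), (10, 25)]

ranking = rank_models(models, crosslinks)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A filter of this kind implicitly assumes a single, folded, monomeric state. As the abstract notes, cross-links measured in solution may arise from flexible regions or from contacts between subunits, so a violated restraint does not necessarily mean the model of the folded monomer is wrong.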

Keywords: SAS; SAXS; assembly; combined methods; crystallography; disorder; experimental restraints; flexibility; modeling; prediction accuracy; protein folding; solution scattering; solution structure; unfolded regions; unstructured regions.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Cross-Linking Reagents / chemistry*
  • Humans
  • Mass Spectrometry / methods*
  • Models, Molecular*
  • Protein Conformation*
  • Protein Folding
  • Proteins / chemistry*
  • Scattering, Small Angle*
  • X-Ray Diffraction

Substances

  • Cross-Linking Reagents
  • Proteins