SEQUENCE SLIDER: expanding polyalanine fragments for phasing with multiple side-chain hypotheses

Acta Crystallogr D Struct Biol. 2020 Mar 1;76(Pt 3):221-237. doi: 10.1107/S2059798320000339. Epub 2020 Feb 25.

Abstract

Fragment-based molecular-replacement methods can solve a macromolecular structure quasi-ab initio. ARCIMBOLDO, using a common secondary-structure or tertiary-structure template or a library of folds, locates these fragments with Phaser and reveals the rest of the structure by density modification and autotracing in SHELXE. The latter stage is challenging when dealing with diffraction data at lower resolution, low solvent content, high β-sheet composition or situations in which the initial fragments represent a low fraction of the total scattering or where their accuracy is low. SEQUENCE SLIDER aims to overcome these complications by extending the initial polyalanine fragment with side chains in a multisolution framework. Its use is illustrated on test cases and previously unknown structures. The selection and order of fragments to be extended follows the decrease in log-likelihood gain (LLG) calculated with Phaser upon the omission of each single fragment. When the starting substructure is derived from a remote homolog, sequence assignment to fragments is restricted by the original alignment. Otherwise, the secondary-structure prediction is matched to that found in fragments and traces. Sequence hypotheses are trialled in a brute-force approach through side-chain building and refinement. Scoring the refined models through their LLG in Phaser may allow discrimination of the correct sequence or filter the best partial structures for further density modification and autotracing. The default limits for the number of models to pursue are hardware dependent. In its most economical implementation, suitable for a single laptop, the main-chain trace is extended as polyserine; trialling models with different sequence assignments instead requires a grid or multicore machine. SEQUENCE SLIDER has been instrumental in solving two novel structures: that of MltC from 2.7 Å resolution data and that of a pneumococcal lipoprotein with 638 residues and 35% solvent content.
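
The two bookkeeping steps described in the abstract, ordering fragments by the LLG lost on omitting each one and enumerating sequence hypotheses whose predicted secondary structure matches a fragment, can be illustrated with a minimal Python sketch. It is not the SEQUENCE SLIDER implementation: the function names, the 0.7 matching threshold, the example LLG values and the example sequence are all assumptions made for illustration, and no Phaser or SHELXE calls are involved. The LLG values are taken as already computed.

```python
from typing import Dict, List, Tuple


def rank_fragments_by_llg_drop(
    full_llg: float, llg_without: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Order fragments by the LLG decrease observed when each is omitted.

    full_llg: LLG of the complete substructure (assumed precomputed).
    llg_without: hypothetical mapping of fragment name to the LLG obtained
    with that single fragment left out.
    """
    drops = {name: full_llg - llg for name, llg in llg_without.items()}
    # Largest drop first: that fragment contributes most to the solution.
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


def sequence_hypotheses(
    sequence: str, predicted_ss: str, fragment_ss: str, min_match: float = 0.7
) -> List[Tuple[int, str, float]]:
    """Slide a fragment along the sequence and keep compatible windows.

    Returns (offset, subsequence, matched fraction) for every window whose
    predicted secondary structure agrees with the fragment's observed
    secondary structure in at least min_match of its positions.
    The threshold is an illustrative assumption, not a published default.
    """
    if len(sequence) != len(predicted_ss):
        raise ValueError("sequence and prediction must have equal length")
    n = len(fragment_ss)
    if n == 0:
        raise ValueError("fragment must contain at least one residue")

    hypotheses = []
    for offset in range(len(sequence) - n + 1):
        window_ss = predicted_ss[offset:offset + n]
        matches = sum(a == b for a, b in zip(window_ss, fragment_ss))
        fraction = matches / n
        if fraction >= min_match:
            hypotheses.append((offset, sequence[offset:offset + n], fraction))
    return hypotheses


if __name__ == "__main__":
    # Made-up LLG values for three fragments.
    print(rank_fragments_by_llg_drop(100.0, {"A": 60.0, "B": 90.0, "C": 75.0}))
    # Made-up sequence and prediction; the fragment is a four-residue helix.
    for hyp in sequence_hypotheses("MKTAYIAKQR", "CCHHHHHHCC", "HHHH", 1.0):
        print(hyp)
```

In the workflow the abstract describes, each surviving hypothesis would then go through side-chain building and refinement and be scored by its LLG. The sketch only shows why the number of models to pursue grows with sequence length and fragment count.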

Keywords: ARCIMBOLDO; Phaser; SEQUENCE SLIDER; SHELXE; fragment-based molecular replacement; molecular replacement; phasing; side-chain extension.

MeSH terms

  • Algorithms
  • Crystallography, X-Ray / methods*
  • Glycosyltransferases / chemistry
  • Lipoproteins / chemistry
  • Peptide Fragments / chemistry*
  • Peptides / chemistry*
  • Protein Folding
  • Protein Structure, Secondary
  • Software*

Substances

  • Lipoproteins
  • Peptide Fragments
  • Peptides
  • polyalanine
  • Glycosyltransferases