Fast design of arbitrary length loops in proteins using InteractiveRosetta

BMC Bioinformatics. 2018 Sep 24;19(1):337. doi: 10.1186/s12859-018-2345-5.

Abstract

Background: With increasing interest in ab initio protein design, there is a need to explore the design space of insertions and deletions fully. Nature inserts and deletes residues to optimize energy and function, but allowing variable-length indels in an interactive protein design session presents challenges of speed and accuracy.

Results: Here we present a new module (INDEL) for InteractiveRosetta which allows the user to specify a range of lengths for a desired indel, and which returns a set of low-energy backbones in a matter of seconds. To make the loop search fast, loop anchor points are geometrically hashed using Cα-Cα and Cβ-Cβ distances, and the hash is mapped to start and end points in a pre-compiled random-access file of non-redundant protein backbone coordinates. Loops with superposable anchors are filtered for collisions and returned to InteractiveRosetta as poly-alanine for display and selective incorporation into the design template. Sidechains can then be added using RosettaDesign tools.
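
The anchor-hashing idea described above can be illustrated with a minimal Python sketch. It is not the INDEL implementation: the bin width, the in-memory dictionary (standing in for the pre-compiled random-access file), the residue representation, and the function names anchor_key, build_index and lookup are all assumptions made for illustration.

```python
import math
from collections import defaultdict

BIN_WIDTH = 0.5  # angstroms; hypothetical value, not taken from the paper


def anchor_key(ca_start, cb_start, ca_end, cb_end, bin_width=BIN_WIDTH):
    """Bin the C-alpha/C-alpha and C-beta/C-beta anchor distances into a hash key."""
    ca_bin = int(math.dist(ca_start, ca_end) // bin_width)
    cb_bin = int(math.dist(cb_start, cb_end) // bin_width)
    return (ca_bin, cb_bin)


def build_index(residues, min_len=3, max_len=20):
    """Map (loop length, distance bins) to (start, end) anchor positions.

    residues: list of dicts with 'CA' and 'CB' coordinate tuples (assumed layout).
    start and end are the indices of the two anchor residues; the loop is the
    loop_len residues strictly between them.
    """
    index = defaultdict(list)
    n = len(residues)
    for start in range(n):
        for loop_len in range(min_len, max_len + 1):
            end = start + loop_len + 1
            if end >= n:
                break
            key = anchor_key(residues[start]["CA"], residues[start]["CB"],
                             residues[end]["CA"], residues[end]["CB"])
            index[(loop_len,) + key].append((start, end))
    return index


def lookup(index, loop_len, ca_start, cb_start, ca_end, cb_end):
    """Return candidate (start, end) positions whose anchors fall in the same bins."""
    key = (loop_len,) + anchor_key(ca_start, cb_start, ca_end, cb_end)
    return index.get(key, [])
```

A query then costs a single dictionary lookup per requested loop length, which is the property that makes an interactive search plausible. A practical version would probably also probe neighbouring bins, so that anchors lying near a bin boundary are not missed, and would still need the superposition and collision filtering steps described above.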

Conclusions: INDEL was able to find viable loops in 100% of 500 attempts for all lengths from 3 to 20 residues. INDEL has been applied to the task of designing a domain-swapping loop for T7-endonuclease I, changing its specificity from Holliday junctions to paranemic crossover (PX) DNA.

Keywords: Bystroff; Indel; InteractiveRosetta; Loop modeling; Protein design; PyRosetta; Rosetta; Simulation; T7 endonuclease I.

MeSH terms

  • Genetic Engineering
  • INDEL Mutation / genetics
  • Models, Molecular
  • Protein Domains
  • Protein Multimerization
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Software*
  • Time Factors

Substances

  • Proteins