Investigation of de novo totally random biosequences, Part II: On the folding frequency in a totally random library of de novo proteins obtained by phage display

Chem Biodivers. 2006 Aug;3(8):840-59. doi: 10.1002/cbdv.200690088.

Abstract

We present an investigation on theoretically possible protein structures which have not been selected by evolution and are, therefore, not present on our Earth ('Never Born Proteins' (NBP)). In particular, we attempt to assess whether and to what extent such polypeptides might be folded, thus acquiring a globular protein status. A library (ca. 10⁹ clones) of totally random polypeptides, with a length of 50 amino acids, has been produced by phage display. The only structural bias in these sequences is a tripeptide substrate for thrombin: PRG, chosen according to the criteria described in the preceding Part I of this series. The presence of this substrate in an otherwise totally random sequence forms the basis for a qualitative experimental criterion which distinguishes unfolded from folded proteins, as folded proteins are more protected from protease digestion than unfolded ones. The investigation of 79 sequences, randomly selected from the initially large library, shows that over 20% of this population is thrombin-resistant, likely due to folding. Analysis of the amino acid sequences of these clones shows no significant homology to extant proteins, which indicates that they are indeed totally de novo. A few of these sequences have been expressed, and here we describe the structural properties of two thrombin-resistant randomly selected ones. These two de novo proteins have been characterized by spectroscopic methods and, in particular, by circular dichroism. The data show a stable three-dimensional folding, which is temperature-resistant and can be reversibly denatured by urea. The consequences of this finding within a library of 'Never Born Proteins' are discussed in terms of molecular evolution.
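
To give a sense of scale for the library described above, the following minimal Python sketch builds one random 50-residue sequence containing a fixed PRG site and compares a library of about 10^9 clones with the number of possible sequences. It is an illustration only, not the authors' procedure. Several points are assumptions not stated in the abstract: that the 50 residues include the PRG tripeptide, that the 20 standard amino acids are drawn uniformly, and that the site sits at index 24 (the names `random_sequence`, `sequence_space_size`, `library_coverage`, and `site_pos` are hypothetical).

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
LENGTH = 50                            # total length given in the abstract
SITE = "PRG"                           # thrombin substrate tripeptide


def random_sequence(rng, site_pos=24):
    """Return a random 50-mer with PRG fixed at site_pos.

    site_pos is a hypothetical choice; the abstract does not say where
    the tripeptide is placed. Uniform residue sampling is also an
    assumption, since codon usage would skew a real library.
    """
    n_random = LENGTH - len(SITE)
    flank = "".join(rng.choice(AMINO_ACIDS) for _ in range(n_random))
    return flank[:site_pos] + SITE + flank[site_pos:]


def sequence_space_size():
    """Number of distinct sequences if only the non-PRG positions vary."""
    return len(AMINO_ACIDS) ** (LENGTH - len(SITE))


def library_coverage(library_size=10**9):
    """Upper bound on the fraction of sequence space a library can sample."""
    return library_size / sequence_space_size()


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is repeatable
    print(random_sequence(rng))
    print(f"sequence space: {sequence_space_size():.3e}")
    print(f"library coverage: {library_coverage():.3e}")
```

Under these assumptions the sequence space is 20^47, roughly 1.4 × 10^61, so even a library of 10^9 clones samples a vanishingly small fraction of it. This is why folding in a sizeable share of randomly picked clones is informative about the space as a whole.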

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Circular Dichroism
  • Computational Biology
  • Computer Simulation
  • Escherichia coli / genetics
  • Escherichia coli / metabolism
  • Gene Expression
  • Models, Molecular
  • Molecular Sequence Data
  • Mutation / genetics
  • Oligopeptides / genetics
  • Peptide Library
  • Plasmids / genetics
  • Protein Folding*
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism*

Substances

  • Oligopeptides
  • Peptide Library
  • Proteins