The use of consensus sequence information to engineer stability and activity in proteins

Methods Enzymol. 2020;643:149-179. doi: 10.1016/bs.mie.2020.06.001. Epub 2020 Jul 17.

Abstract

The goal of protein design is to create proteins that are stable, soluble, and active. Here we focus on one approach to protein design in which sequence information is used to create a "consensus" sequence. Such consensus sequences comprise the most common residue at each position in a multiple sequence alignment (MSA). After describing some general ideas that relate MSAs and consensus sequences and presenting a statistical thermodynamic framework that relates consensus and non-consensus sequences to stability, we detail the process of designing a consensus sequence and survey reports of consensus design and characterization from the literature. Many of these consensus proteins retain native biological activities, including ligand binding and enzyme activity. Remarkably, in most cases the consensus protein shows significantly higher stability than extant versions of the protein, as measured by thermal or chemical denaturation, consistent with the statistical thermodynamic model. To understand this stability increase, we compare various features of consensus sequences with the extant MSA sequences from which they were derived. Consensus sequences show enrichment in charged residues (most notably glutamate and lysine) and depletion of uncharged polar residues (glutamine, serine, and asparagine). Surprisingly, a survey of stability changes resulting from point substitutions shows little correlation with residue frequencies at the corresponding positions within the MSA, suggesting that the high stability of consensus proteins may result from interactions among residue pairs or higher-order clusters. Whatever the source, the large number of reported successes demonstrates that consensus design is a viable route to generating active and, in many cases, highly stabilized proteins.
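The definition of a consensus sequence given above (the most common residue at each MSA position) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function name `consensus_sequence`, the `max_gap_fraction` cutoff for dropping gap-dominated columns, the alphabetical tie-breaking rule, and the toy alignment are all assumptions made for illustration only.

```python
from collections import Counter

GAP_CHARS = "-."  # characters treated as alignment gaps (assumption)


def consensus_sequence(msa, max_gap_fraction=0.5):
    """Return the most common residue at each column of an aligned MSA.

    Columns in which more than `max_gap_fraction` of the sequences have a
    gap are omitted from the consensus (an illustrative choice, not a
    rule taken from the source).
    """
    if not msa:
        raise ValueError("empty alignment")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*msa):  # iterate over alignment positions
        residues = [r.upper() for r in column if r not in GAP_CHARS]
        # Skip columns that are empty or dominated by gaps.
        if not residues or len(residues) < (1 - max_gap_fraction) * len(column):
            continue
        counts = Counter(residues)
        # Highest count wins; ties are broken alphabetically (assumption).
        best = min(counts, key=lambda r: (-counts[r], r))
        consensus.append(best)
    return "".join(consensus)


# Toy alignment, invented for illustration only.
toy_msa = [
    "MKELV-A",
    "MKDLVQA",
    "MRELI-A",
    "MKELVQG",
]
print(consensus_sequence(toy_msa))  # MKELVQA
```

In practice the alignment would typically be curated first (for example, by removing fragments and highly redundant sequences), since the consensus depends on which sequences are included; the sketch leaves that step out.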

Keywords: Bioinformatics; Consensus sequences; Multiple sequence alignments; Protein design; Protein engineering; Protein stability.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Sequence
  • Consensus Sequence
  • Proteins* / genetics
  • Sequence Alignment
  • Thermodynamics

Substances

  • Proteins