Analysis of Large-Scale Mutagenesis Data To Assess the Impact of Single Amino Acid Substitutions

Genetics. 2017 Sep;207(1):53-61. doi: 10.1534/genetics.117.300064. Epub 2017 Jul 27.

Abstract

Mutagenesis is a widely used method for identifying protein positions that are important for function or ligand binding. Advances in high-throughput DNA sequencing and mutagenesis techniques have enabled measurement of the effects of nearly all possible amino acid substitutions in many proteins. The resulting large-scale mutagenesis data sets offer a unique opportunity to draw general conclusions about the effects of different amino acid substitutions. We therefore analyzed 34,373 mutations in 14 proteins whose effects were measured using large-scale mutagenesis approaches. Methionine was the most tolerated substitution, while proline was the least tolerated. We found that several substitutions, including histidine and asparagine, best recapitulated the effects of other substitutions, even when the identity of the wild-type amino acid was considered. The effects of histidine and asparagine substitutions also correlated best with the effects of other substitutions in different structural contexts. Furthermore, highly disruptive substitutions such as aspartic acid and glutamic acid had the most discriminatory power for detecting ligand interface positions. Our work highlights the utility of large-scale mutagenesis data, and our conclusions can help guide future single-substitution mutational scans.
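
One way to picture what it means for a substitution to "recapitulate the effects of other substitutions" is sketched below. This is an illustrative assumption, not necessarily the procedure used in the paper: for each amino acid, the score of that substitution at every position is correlated (Pearson) with the median score of all other substitutions at the same position. The function name `representativeness`, the input layout, and the toy scores are hypothetical and are not data from the study.

```python
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def representativeness(scores):
    """Correlate each substitution with the median of all other substitutions.

    scores: {position: {amino_acid: effect_score}} (hypothetical layout).
    Returns {amino_acid: correlation} for amino acids seen at >= 2 positions.
    """
    amino_acids = sorted({aa for subs in scores.values() for aa in subs})
    result = {}
    for aa in amino_acids:
        own, others = [], []
        for subs in scores.values():
            rest = [v for other, v in subs.items() if other != aa]
            if aa in subs and rest:
                own.append(subs[aa])          # effect of this substitution
                others.append(median(rest))   # typical effect of the others
        if len(own) >= 2:
            result[aa] = pearson(own, others)
    return result


# Toy, made-up scores for three positions (not data from the study).
toy_scores = {
    10: {"H": 0.9, "P": 0.1, "M": 1.0},
    11: {"H": 0.5, "P": 0.0, "M": 0.9},
    12: {"H": 0.2, "P": 0.0, "M": 0.8},
}
correlations = representativeness(toy_scores)
for aa, r in sorted(correlations.items(), key=lambda kv: kv[1], reverse=True):
    print(aa, round(r, 3))
```

Under this sketch, a substitution with a high correlation tracks the typical effect of mutating a position, which is the sense in which a single substitution could stand in for a full scan. A rank correlation could be swapped in for `pearson` if the scores are not on a comparable linear scale across proteins.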

Keywords: deep mutational scanning; molecular biology; mutations; scanning mutagenesis.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Amino Acid Substitution / genetics*
  • Amino Acids / genetics
  • Genome, Human*
  • Humans
  • Models, Genetic*
  • Mutation Rate

Substances

  • Amino Acids