Pairs of Mutually Compensatory Frameshifting Mutations Contribute to Protein Evolution

Mol Biol Evol. 2022 Mar 2;39(3):msac031. doi: 10.1093/molbev/msac031.

Abstract

Insertions and deletions of lengths not divisible by 3 in protein-coding sequences cause frameshifts that usually induce premature stop codons and may carry a high fitness cost. However, this cost can be partially offset by a second compensatory indel restoring the reading frame. The role of such pairs of compensatory frameshifting mutations (pCFMs) in evolution has not been studied systematically. Here, we use whole-genome alignments of protein-coding genes of 100 vertebrate species, and of 122 insect species, studying the prevalence of pCFMs in their divergence. We detect a total of 624 candidate pCFM genes; six of them pass stringent quality filtering, including three human genes: RAB36, ARHGAP6, and NCR3LG1. In some instances, amino acid substitutions closely predating or following pCFMs restored the biochemical similarity of the frameshifted segment to the ancestral amino acid sequence, possibly reducing or negating the fitness cost of the pCFM. Typically, however, the biochemical similarity of the frameshifted sequence to the ancestral one was not higher than the similarity of a random sequence of a protein-coding gene to its frameshifted version, indicating that pCFMs can uncover radically novel regions of protein space. In total, pCFMs represent an appreciable and previously overlooked source of novel variation in amino acid sequences.
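The frame-restoring mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' detection pipeline: the toy sequence and the helper names (`translate`, `insert_bases`, `delete_bases`) are hypothetical and chosen only for illustration. A 1-bp insertion shifts the reading frame, and a 1-bp deletion downstream restores it, leaving a short frameshifted segment between the two indels.

```python
# Standard genetic code, built from the compact TCAG-ordered table.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO
    )
}


def translate(seq: str) -> str:
    """Translate in frame 0; stops are shown as '*', trailing bases are dropped."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def insert_bases(seq: str, pos: int, bases: str) -> str:
    """Return seq with `bases` inserted before 0-based index `pos`."""
    return seq[:pos] + bases + seq[pos:]


def delete_bases(seq: str, pos: int, length: int) -> str:
    """Return seq with `length` bases removed starting at 0-based index `pos`."""
    return seq[:pos] + seq[pos + length:]


# Hypothetical toy coding sequence (not taken from the paper).
ancestral = "ATGGCTGAAACCGGTCTGAAAGATTGGTAA"

# First indel: a +1 insertion shifts the frame of everything downstream.
single = insert_bases(ancestral, 6, "C")

# Second indel: a -1 deletion further downstream restores the original frame.
double = delete_bases(single, 16, 1)

print("ancestral:", translate(ancestral))
print("single   :", translate(single))
print("double   :", translate(double))
```

In the double mutant, only the codons between the two indels are read in the shifted frame; the residues before the insertion and after the deletion match the ancestral protein. That intervening segment is where a pCFM exposes new amino acid sequence.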

Keywords: compensatory evolution; evolution of novelty; frameshifting indels.

MeSH terms

  • Amino Acid Sequence
  • Humans
  • INDEL Mutation*
  • Mutation
  • Open Reading Frames
  • Proteins* / genetics

Substances

  • Proteins