Consequences of Substitution Model Selection on Protein Ancestral Sequence Reconstruction

Mol Biol Evol. 2022 Jul 2;39(7):msac144. doi: 10.1093/molbev/msac144.

Abstract

Selecting the best-fitting substitution model of molecular evolution is a traditional step in phylogenetic inference, including ancestral sequence reconstruction (ASR). However, a few recent studies have suggested that applying this procedure does not affect the accuracy of phylogenetic tree reconstruction. Here, we revisit this debate by analyzing how selection among substitution models of protein evolution, with a focus on exchangeability matrices, influences the accuracy of ASR on simulated and real data. We found that the selected best-fitting substitution model produces the most accurate ancestral sequences, especially when the data present large genetic diversity. Moreover, ancestral sequences reconstructed under substitution models with similar exchangeability matrices were similar, suggesting that if the selected best-fitting model cannot be used for the reconstruction, a model similar to the selected one should be preferred. We conclude that selecting among substitution models of protein evolution is recommended for reconstructing accurate ancestral sequences.
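As an illustration of what "selecting the best-fitting model" can mean in practice, the sketch below ranks candidate protein substitution models by the Akaike information criterion (AIC). The abstract does not state which criterion the study used, so AIC is an assumption here, and the log-likelihoods and parameter counts are hypothetical placeholders rather than results from the paper. In a real analysis they would come from fitting each model to the alignment with a phylogenetics program.

```python
def aic(log_likelihood: float, n_free_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_free_params - 2 * log_likelihood


def select_best_model(fits: dict) -> tuple:
    """Return (best_model_name, {model_name: AIC}).

    `fits` maps a model name to (log_likelihood, n_free_params).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical placeholder values, NOT taken from the study.
# Empirical exchangeability matrices such as JTT, WAG, and LG are fixed,
# so the free parameters counted here would mostly be branch lengths.
candidate_fits = {
    "JTT": (-10250.0, 37),
    "WAG": (-10210.5, 37),
    "LG": (-10195.2, 37),
}

best_model, aic_scores = select_best_model(candidate_fits)
for name, score in sorted(aic_scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("Selected model:", best_model)
```

When all candidates have the same number of free parameters, as in this toy example, the AIC ranking reduces to a ranking by likelihood. The penalty term only matters when models differ in complexity, for instance when rate heterogeneity or estimated amino acid frequencies are added.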

Keywords: ancestral sequence reconstruction; molecular evolution; phylogenetics; protein evolution; substitution model selection; substitution models of protein evolution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Evolution, Molecular*
  • Mutation, Missense
  • Phylogeny
  • Proteins / genetics

Substances

  • Proteins