Inferring Indel Parameters using a Simulation-based Approach

Genome Biol Evol. 2015 Nov 3;7(12):3226-38. doi: 10.1093/gbe/evv212.

Abstract

In this study, we present a novel simulation-based methodology for inferring indel parameters from multiple sequence alignments (MSAs). Our algorithm searches for the set of evolutionary parameters describing indel dynamics that best fits a given input MSA. In each step of the search, we use parametric bootstraps and the Mahalanobis distance to estimate how well a proposed set of parameters fits the input data. Using simulations, we demonstrate that our methodology can accurately infer the indel parameters for a large variety of plausible settings. Moreover, using our methodology, we show that indel parameters vary substantially among three genomic data sets: mammals, bacteria, and retroviruses. Finally, we demonstrate how our methodology can be used to simulate MSAs based on indel parameters inferred from real data sets.
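
The fit-scoring step described above can be illustrated with a minimal Python sketch. It assumes each MSA has already been reduced to a numeric vector of summary statistics (for example, gap counts or mean gap length); the abstract does not specify which statistics are used, so that choice, the function name `mahalanobis_distance`, and the use of a pseudo-inverse for the covariance matrix are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def mahalanobis_distance(observed, simulated):
    """Mahalanobis distance between one observed summary-statistic vector
    and the distribution of the same statistics over simulated replicates.

    observed  : sequence of k statistics from the input MSA (hypothetical).
    simulated : (n, k) array, one row per MSA simulated under a proposed
                parameter set (the parametric bootstrap replicates).
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)

    # Mean and sample covariance of the statistics across replicates.
    mean = simulated.mean(axis=0)
    cov = np.atleast_2d(np.cov(simulated, rowvar=False))

    # Assumption: use the pseudo-inverse so that a singular covariance
    # matrix (e.g. a statistic that never varies) does not raise an error.
    diff = observed - mean
    squared = float(diff @ np.linalg.pinv(cov) @ diff)
    return max(squared, 0.0) ** 0.5
```

Under these assumptions, a search procedure would simulate replicates for each candidate parameter set, score each set with this distance, and prefer the set with the smallest value.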

Keywords: Mahalanobis distance; alignments; indels; phylogeny; simulations.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Genome, Bacterial
  • Genome, Viral
  • INDEL Mutation*
  • Mammals
  • Sequence Alignment / methods*
  • Software*