Maximum likelihood estimates of pairwise rearrangement distances

J Theor Biol. 2017 Jun 21:423:31-40. doi: 10.1016/j.jtbi.2017.04.015. Epub 2017 Apr 20.

Abstract

Accurate estimation of evolutionary distances between taxa is important for many phylogenetic reconstruction methods. Distances can be estimated using a range of different evolutionary models, from single nucleotide polymorphisms to large-scale genome rearrangements. Corresponding corrections for genome rearrangement distances fall into 3 categories: Empirical computational studies, Bayesian/MCMC approaches, and combinatorial approaches. Here, we introduce a maximum likelihood estimator for the inversion distance between a pair of genomes, using a group-theoretic approach to modelling inversions introduced recently. This MLE functions as a corrected distance: in particular, we show that because of the way sequences of inversions interact with each other, it is quite possible for minimal distance and MLE distance to differently order the distances of two genomes from a third. The second aspect tackles the problem of accounting for the symmetries of circular arrangements. While, generally, a frame of reference is locked, and all computation made accordingly, this work incorporates the action of the dihedral group so that distance estimates are free from any a priori frame of reference. The philosophy of accounting for symmetries can be applied to any existing correction method, for which examples are offered.

Keywords: Algebraic biology; Coset; Genome rearrangement; Group theory; Inversion; Maximum likelihood; Phylogeny.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Evolution, Molecular*
  • Genome / genetics*
  • Likelihood Functions
  • Phylogeny*
  • Spatial Analysis