Family-Level Sampling of Mitochondrial Genomes in Coleoptera: Compositional Heterogeneity and Phylogenetics

Genome Biol Evol. 2015 Dec 8;8(1):161-75. doi: 10.1093/gbe/evv241.

Abstract

Mitochondrial genomes are readily sequenced with recent technology and thus evolutionary lineages can be densely sampled. This permits better phylogenetic estimates and assessment of potential biases resulting from heterogeneity in nucleotide composition and rate of change. We gathered 245 mitochondrial sequences for the Coleoptera representing all 4 suborders, 15 superfamilies of Polyphaga, and altogether 97 families, including 159 newly sequenced full or partial mitogenomes. Compositional heterogeneity greatly affected 3rd codon positions, and to a lesser extent the 1st and 2nd positions, even after RY coding. Heterogeneity also affected the encoded protein sequence, particularly in the nad2, nad4, nad5, and nad6 genes. Credible tree topologies were obtained with the nhPhyML ("nonhomogeneous") algorithm implementing a model for branch-specific equilibrium frequencies. Likelihood searches using RAxML were improved by data partitioning by gene and codon position. Finally, the PhyloBayes software, which allows different substitution processes for amino acid replacement at various sites, produced a tree that best matched known higher-level taxa and defined basal relationships in Coleoptera. After rooting with Neuropterida outgroups, suborder relationships were resolved as (Polyphaga (Myxophaga (Archostemata + Adephaga))). The infraorder relationships in Polyphaga were (Scirtiformia (Elateriformia ((Staphyliniformia + Scarabaeiformia) (Bostrichiformia (Cucujiformia))))). Polyphagan superfamilies were recovered as monophyla except Staphylinoidea (paraphyletic for Scarabaeiformia) and Cucujoidea, which can no longer be considered a valid taxon. The study shows that, although compositional heterogeneity is not universal, it cannot be eliminated for some mitochondrial genes, but dense taxon sampling and the use of appropriate Bayesian analyses can still produce robust phylogenetic trees.
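
As an illustration of the RY coding mentioned in the abstract, the sketch below recodes purines (A, G) as R and pyrimidines (C, T) as Y, and extracts a single codon position from an in-frame coding sequence. It is a minimal example, not the authors' pipeline: the function names `ry_code` and `codon_position` are hypothetical, and the handling of ambiguity codes and gaps (left unchanged) is an assumption.

```python
# Hypothetical helpers illustrating RY coding; not taken from the study.
# Purines (A, G) become R; pyrimidines (C, T, and U for RNA) become Y.
# Any other symbol (N, gaps, ambiguity codes) is left unchanged (assumption).
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(seq: str) -> str:
    """Return the RY-recoded version of a nucleotide sequence."""
    return seq.upper().translate(RY_MAP)


def codon_position(seq: str, pos: int) -> str:
    """Return the sites at codon position 1, 2, or 3 of an in-frame sequence."""
    if pos not in (1, 2, 3):
        raise ValueError("pos must be 1, 2, or 3")
    return seq[pos - 1::3]


if __name__ == "__main__":
    cds = "ATGCCA"                      # toy in-frame sequence, two codons
    third = codon_position(cds, 3)      # sites at 3rd codon positions
    print(third, ry_code(third))
```

Recoding reduces four nucleotide states to two, which can dampen compositional bias between lineages; the abstract reports that heterogeneity nevertheless persisted after RY coding.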

Keywords: PhyloBayes; RY coding; long-range PCR; mitogenomes; mixture models; rogue taxa.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Coleoptera / classification
  • Coleoptera / genetics*
  • Genetic Heterogeneity*
  • Genome, Insect*
  • Genome, Mitochondrial*
  • Phylogeny*