Resolving fine-grained dynamics of retrotransposons: comparative analysis of inferential methods and genomic resources

Plant J. 2017 Jun;90(5):979-993. doi: 10.1111/tpj.13524. Epub 2017 Apr 8.

Abstract

Transposable elements support genome diversification, but characterizing their role in evolution requires comparing their proliferation and genomic distribution within and among species. Such inferences are challenging because incomplete sampling of repetitive genome regions can introduce bias. Here, using the assembled genome as well as genome skimming datasets of Arabis alpina, we assessed the limits of current approaches for inferring the biology of transposable elements. Long terminal repeat retrotransposons (LTR-RTs) identified in the assembled genome were classified into monophyletic lineages (here called tribes), including families of similar copies in Arabis along with elements from related Brassicaceae. Inference of their dynamics using divergence of LTRs in full-length copies and mismatch distribution of genetic variation among all copies congruently highlighted recent transposition bursts, although ancient proliferation events were apparent only with mismatch distribution. Similar inferences of LTR-RT dynamics based on random sequences from genome skimming were highly correlated with assembly-based estimates, supporting accurate analyses from shallow sequencing. Proportions of LTR-RT copies next to genes estimated from assembled genomes and from genome skimming were congruent, pointing to tribes being over- or under-represented in the vicinity of genes. Finally, genome skimming at low coverage yielded accurate inferences of LTR-RT dynamics and distribution, although only the most abundant families appeared robustly analysed at 0.1X. Examining the pitfalls and benefits of approaches relying on different genomic resources, we highlight that random sequencing reads are adequate data that suitably complement the biased samples of LTR-RT copies retrieved from assembled genomes towards comprehensive surveys of the biology of transposable elements.
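As an illustration of the mismatch-distribution idea mentioned above, the following is a minimal sketch, not the authors' pipeline. It assumes the copies of one family are already aligned to equal length, skips gapped or ambiguous sites, and tallies how many copy pairs differ at a given number of sites. The function names and toy sequences are hypothetical. The same pairwise comparison can be applied to the two LTRs of a single full-length copy to obtain their divergence.

```python
from collections import Counter
from itertools import combinations

# Assumption: sites carrying a gap or an ambiguous base in either copy are skipped.
GAP_CHARS = {"-", "N"}


def pairwise_differences(seq_a, seq_b):
    """Count differing sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in GAP_CHARS and b not in GAP_CHARS and a != b
    )


def mismatch_distribution(aligned_copies):
    """Map each number of pairwise differences to the number of copy pairs showing it."""
    counts = Counter(
        pairwise_differences(a, b) for a, b in combinations(aligned_copies, 2)
    )
    return dict(sorted(counts.items()))


# Toy alignment (made-up sequences): three near-identical copies and one divergent copy.
copies = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAT",
    "TCGAACGAAT",
]
distribution = mismatch_distribution(copies)
print(distribution)
```

In such a tally, an excess of pairs with few differences is what one would read as a recent burst, whereas a mode at larger values points to older proliferation. Real analyses would typically normalise by alignment length and correct for multiple substitutions, which this sketch omits.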

Keywords: Arabis alpina; assembled genome; genome skimming; long terminal repeat retrotransposons; mismatch distribution; transposable elements; transposition burst.

MeSH terms

  • DNA Transposable Elements / genetics
  • Evolution, Molecular
  • Genetic Variation / genetics
  • Genome, Plant / genetics*
  • Genomics
  • Phylogeny
  • Plant Proteins / genetics
  • Retroelements / genetics*
  • Terminal Repeat Sequences / genetics*

Substances

  • DNA Transposable Elements
  • Plant Proteins
  • Retroelements