Assembly and RNA-free annotation of highly heterozygous genomes: The case of the thick-billed murre (Uria lomvia)

Mol Ecol Resour. 2018 Jan;18(1):79-90. doi: 10.1111/1755-0998.12712. Epub 2017 Sep 18.

Abstract

Thanks to a dramatic reduction in sequencing costs followed by rapid development of bioinformatics tools, genome assembly and annotation have become accessible to many researchers in recent years. Among tetrapods, birds have genomes with many features that facilitate assembly and annotation, such as small genome size, a low number of repeats and a highly conserved genomic structure. However, we found that high genomic heterozygosity could have a great impact on the quality of the genome assembly of the thick-billed murre (Uria lomvia), an arctic colonial seabird. In this study, we tested the performance of three genome assemblers, Ray/SSPACE, SOAPdenovo2 and Platanus, in assembling the highly heterozygous genome of the thick-billed murre. Our results show that Platanus, an assembler specifically designed for heterozygous genomes, outperforms the other two approaches and produces a highly contiguous (N50 = 15.8 Mb) and complete genome assembly (93% of the genes in the Benchmarking Universal Single-Copy Orthologs [BUSCO] gene set are present). Additionally, we annotated the thick-billed murre genome using a homology-based approach that takes advantage of the genomic resources available for birds and other taxa. Our study will be useful for researchers who are approaching the assembly and annotation of highly heterozygous genomes or of genomes of species of conservation concern, and/or who have limited financial resources.
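
For readers unfamiliar with the contiguity statistic quoted above, the following is a minimal sketch of how an N50 value is commonly computed from a list of sequence lengths. It is not taken from the study: the function name `n50` and the toy lengths are hypothetical, and the abstract does not say which tool the authors used to compute the statistic or whether it refers to scaffolds or contigs.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest, accumulating length
    # until at least half of the total assembly length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units, not real data).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

A larger N50 means that half of the assembly lies in longer sequences, which is why the statistic is used as a summary of contiguity. It says nothing about correctness or gene content, so it is usually reported alongside a completeness measure such as the BUSCO percentage.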

Keywords: birds; gene annotation; genome assembler; heterozygosity; nonmodel organism; whole-genome sequencing.

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Animals
  • Charadriiformes / genetics*
  • Computational Biology / methods*
  • Genome*
  • Heterozygote*
  • Molecular Sequence Annotation / methods*