Siberian larch (Larix sibirica Ledeb.) mitochondrial genome assembled using both short and long nucleotide sequence reads is currently the largest known mitogenome

BMC Genomics. 2020 Sep 23;21(1):654. doi: 10.1186/s12864-020-07061-4.

Abstract

Background: Plant mitochondrial genomes (mitogenomes) can be structurally complex while their size can vary from ~ 222 Kbp in Brassica napus to 11.3 Mbp in Silene conica. To date, in comparison with the number of plant species, only a few plant mitogenomes have been sequenced and released, particularly for conifers (the Pinaceae family). Conifers cover an ancient group of land plants that includes about 600 species, and which are of great ecological and economical value. Among them, Siberian larch (Larix sibirica Ledeb.) represents one of the keystone species in Siberian boreal forests. Yet, despite its importance for evolutionary and population studies, the mitogenome of Siberian larch has not yet been assembled and studied.

Results: Two sources of DNA sequences were used to search for mitochondrial DNA (mtDNA) sequences: mtDNA enriched samples and nucleotide reads generated in the de novo whole genome sequencing project, respectively. The assembly of the Siberian larch mitogenome contained nine contigs, with the shortest and the largest contigs being 24,767 bp and 4,008,762 bp, respectively. The total size of the genome was estimated at 11.7 Mbp. In total, 40 protein-coding, 34 tRNA, and 3 rRNA genes and numerous repetitive elements (REs) were annotated in this mitogenome. In total, 864 C-to-U RNA editing sites were found for 38 out of 40 protein-coding genes. The immense size of this genome, currently the largest reported, can be partly explained by variable numbers of mobile genetic elements, and introns, but unlikely by plasmid-related sequences. We found few plasmid-like insertions representing only 0.11% of the entire Siberian larch mitogenome.

Conclusions: Our study showed that the size of the Siberian larch mitogenome is much larger than in other so far studied Gymnosperms, and in the same range as for the annual flowering plant Silene conica (11.3 Mbp). Similar to other species, the Siberian larch mitogenome contains relatively few genes, and despite its huge size, the repeated and low complexity regions cover only 14.46% of the mitogenome sequence.

Keywords: Larix sibirica; Long reads; Mitochondrial genome; Nucleotide sequence; RNA editing; mtDNA.

MeSH terms

  • Contig Mapping
  • Genome Size*
  • Genome, Mitochondrial*
  • Larix / genetics*
  • Molecular Sequence Annotation
  • Plant Proteins / genetics
  • RNA, Ribosomal / genetics
  • RNA, Transfer / genetics
  • Repetitive Sequences, Nucleic Acid

Substances

  • Plant Proteins
  • RNA, Ribosomal
  • RNA, Transfer