Nanopore Sequencing Resolves Elusive Long Tandem-Repeat Regions in Mitochondrial Genomes

Int J Mol Sci. 2021 Feb 11;22(4):1811. doi: 10.3390/ijms22041811.

Abstract

Long, non-coding, tandem-repetitive regions in the mitochondrial (mt) genomes of many metazoans have been notoriously difficult to characterise accurately using conventional sequencing methods. Here, we show how a third-generation (long-read) sequencing and informatic approach can overcome this problem. We employed Oxford Nanopore technology to sequence genomic DNA from a pool of adult worms of the carcinogenic parasite Schistosoma haematobium, and used an informatic workflow to define the complete mt non-coding region(s). Using high-coverage long-read data, we defined six dominant mt genomes ranging from 33.4 kb to 22.6 kb. Although no variation was detected in the order or lengths of the protein-coding genes, there was marked variation in the length (18.5 kb to 7.6 kb) and structure of the non-coding region, raising questions about the evolution and function of what might be a control region that regulates mt transcription and/or replication. The discovery here of the largest tandem-repetitive, non-coding region (18.5 kb) in a metazoan organism also raises a question about the completeness of some of the mt genomes of animals reported to date, and should stimulate further explorations using a Nanopore-informatic workflow.
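The reported sizes can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the authors' workflow. It assumes that the largest genome (33.4 kb) carries the largest non-coding region (18.5 kb) and the smallest genome (22.6 kb) the smallest (7.6 kb). The abstract implies this pairing but does not state it. The variable and function names are hypothetical.

```python
# Sizes reported in the abstract, in kilobases (kb).
# ASSUMPTION: the largest genome pairs with the largest non-coding region,
# and the smallest genome with the smallest; the abstract does not say so explicitly.
genome_kb = {"largest": 33.4, "smallest": 22.6}
noncoding_kb = {"largest": 18.5, "smallest": 7.6}


def remainder_kb(total_kb: float, noncoding: float) -> float:
    """Length left after subtracting the non-coding region, rounded to 0.1 kb."""
    return round(total_kb - noncoding, 1)


# Portion of each genome outside the non-coding region.
remainders = {k: remainder_kb(genome_kb[k], noncoding_kb[k]) for k in genome_kb}

# Share of each genome taken up by the non-coding region.
noncoding_fraction = {
    k: round(noncoding_kb[k] / genome_kb[k], 2) for k in genome_kb
}

print(remainders)
print(noncoding_fraction)
```

Under that pairing, both genomes leave roughly 15 kb outside the non-coding region (14.9 kb and 15.0 kb). This fits the statement that the protein-coding genes do not vary in order or length. On the same assumption, the non-coding region makes up about half of the largest genome and about a third of the smallest.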

Keywords: Oxford Nanopore technology; Schistosoma haematobium; informatics; mitochondrial (mt) genome; non-coding (control) region; tandem-repetitive DNA.

MeSH terms

  • Animals
  • Genome, Helminth*
  • Genome, Mitochondrial*
  • Nanopore Sequencing*
  • Schistosoma haematobium / genetics*
  • Tandem Repeat Sequences*