Background: Detecting heteroplasmic variations in the mitochondrial genome can help identify potentially pathogenic variants, which is significant for disease prevention. The development of next-generation sequencing extended the quantification of mitochondrial DNA (mtDNA) heteroplasmy from scanning a limited number of recorded sites to the entire mitochondrial genome. However, because of nuclear mtDNA homologous sequences (nuMTs), it remains a dilemma to retain as many real variations as possible while excluding false heteroplasmic variations that arise from nuMTs and sequencing errors.
Results: Herein, we used an improved method for detecting low-frequency mtDNA heteroplasmic variations, including point variations and short-fragment length alterations, from whole genome sequencing data, and evaluated its performance. A two-step alignment was designed to accelerate data processing, to retain the true mtDNA reads and to eliminate most nuMTs reads. In whole genome sequencing data of K562 and GM12878 cells, ~90% of the detected heteroplasmic point variations were found in MitoMap. The results were consistent with those of an amplification refractory mutation system qPCR. Many linkages among the detected heteroplasmic variations were also discovered.
Conclusions: Our improved method is a simple, efficient and accurate way to mine low-frequency mitochondrial heteroplasmic variations from whole genome sequencing data. By evaluating the highest possible level of misalignment caused by the remaining nuMTs-like reads and sequencing errors, our procedure can detect mtDNA heteroplasmic variations with heteroplasmy frequencies as low as 0.2%.
Keywords: Heteroplasmic linkages; Heteroplasmic variations; Mitochondrial genome; Next-generation sequencing; Whole genome sequencing.
Copyright © 2019 Elsevier B.V. All rights reserved.
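As an illustration of the 0.2% detection limit stated in the Conclusions, the sketch below computes a per-site heteroplasmy frequency as the fraction of reads carrying the alternative allele and keeps sites at or above that limit. It is a minimal sketch and not the authors' pipeline: the `SiteCounts` type, the function names, the `min_alt_reads` guard and the example positions and counts are all hypothetical, and the evaluation of misalignment from nuMTs-like reads described above is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SiteCounts:
    """Read counts at one mtDNA position (hypothetical container)."""
    position: int
    ref_count: int  # reads supporting the reference allele
    alt_count: int  # reads supporting the alternative allele


def heteroplasmy_frequency(site: SiteCounts) -> float:
    """Fraction of reads at the site that carry the alternative allele."""
    depth = site.ref_count + site.alt_count
    if depth == 0:
        return 0.0
    return site.alt_count / depth


def call_heteroplasmic_sites(
    sites: List[SiteCounts],
    min_frequency: float = 0.002,  # the 0.2% limit quoted in the abstract
    min_alt_reads: int = 5,        # assumed guard against single-read noise
) -> List[Tuple[int, float]]:
    """Return (position, frequency) for sites passing both thresholds."""
    calls = []
    for site in sites:
        freq = heteroplasmy_frequency(site)
        if freq >= min_frequency and site.alt_count >= min_alt_reads:
            calls.append((site.position, freq))
    return calls


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = [
        SiteCounts(position=100, ref_count=9970, alt_count=30),
        SiteCounts(position=200, ref_count=9990, alt_count=10),
        SiteCounts(position=300, ref_count=996, alt_count=4),
    ]
    for position, freq in call_heteroplasmic_sites(example):
        print(f"position {position}: heteroplasmy {freq:.2%}")
```

The read-count guard is included because a frequency threshold alone is only meaningful when sequencing depth is high; at low depth a handful of erroneous reads can exceed 0.2%.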