Accurate Sequencing and Haplotyping from 10 Cells Using Long Fragment Read (LFR) Technology

Methods Mol Biol. 2023:2590:71-84. doi: 10.1007/978-1-0716-2819-5_5.

Abstract

In this chapter, we describe how Long Fragment Read (LFR) technology can be applied to samples consisting of very few cells (5-20) to enable complete genome sequencing and haplotyping with a very low false positive error rate. LFR is a method for processing DNA or cells prior to sequencing on any second-generation DNA sequencing platform (e.g., MGI's DNBSEQ, Illumina sequencers, etc.). First, the LFR process incorporates a low-bias whole genome amplification step allowing accurate sequencing from very low DNA inputs (as low as 32 picograms, the mass contained within 5 diploid human cells). In addition, LFR enables the haplotyping of nearly all genomic variations with N50 contig lengths up to ~1 Mb. Furthermore, if data from this method are analyzed with parental genotype data, it is possible to generate phased variants in uninterrupted contigs spanning entire chromosomes. Importantly, the barcoding process utilized in this method allows for the detection and correction of most amplification, sequencing, and mapping errors, yielding false positive error rates as low as 10⁻⁹. Finally, the cost of this method is modest and enables extremely high-quality whole genome sequence and haplotype data from as few as 5 cells. We know of few other methods that can achieve this.
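Two quantities quoted in the abstract can be made concrete with a short sketch. It is illustrative only and is not part of the LFR protocol or its analysis software. The function names are hypothetical, the per-cell DNA mass is derived from the abstract's own figures (32 pg for 5 diploid cells, i.e. about 6.4 pg per cell), and N50 is computed using its standard definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the summed contig length.

```python
# Illustrative sketch only; names and constants are assumptions, not LFR software.

# Assumption derived from the abstract: 32 pg corresponds to 5 diploid human cells.
PG_PER_DIPLOID_CELL = 32 / 5  # about 6.4 pg per cell


def cells_equivalent(mass_pg):
    """Convert a DNA input mass in picograms to an approximate number of diploid cells."""
    if mass_pg < 0:
        raise ValueError("mass must be non-negative")
    return mass_pg / PG_PER_DIPLOID_CELL


def n50(contig_lengths):
    """Return the N50 of a collection of phased-contig lengths (in base pairs).

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative length first reaches half of the total.
    """
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths, chosen only to exercise the function.
    example_contigs = [1_000_000, 800_000, 600_000, 400_000, 200_000]
    print("N50 (bp):", n50(example_contigs))
    print("Cells in 32 pg:", round(cells_equivalent(32), 2))
```

On this definition, an N50 of about 1 Mb means that at least half of the phased sequence lies in contigs of 1 Mb or longer.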

Keywords: 10 cells; Long DNA molecules; co-barcoding; error correction; experimental haplotyping; phasing; whole genome sequencing; whole-genome amplification (WGA).
