A clinically validated whole genome pipeline for structural variant detection and analysis

BMC Genomics. 2019 Jul 16;20(Suppl 8):545. doi: 10.1186/s12864-019-5866-z.

Abstract

Background: With the continuing decrease in cost of whole genome sequencing (WGS), we have already approached the point of inflection where WGS testing has become economically feasible, facilitating broader access to the benefits that are helping to define WGS as the new diagnostic standard. WGS provides unique opportunities for detection of structural variants; however, such analyses, despite being recognized by the research community, have not previously made their way into routine clinical practice.

Results: We have developed a clinically validated pipeline for highly specific and sensitive detection of structural variants based on 30X PCR-free WGS. Using a combination of breakpoint analysis of split and discordant reads, and read depth analysis, the pipeline identifies structural variants down to single base pair resolution. False positives are minimized using calculations for loss of heterozygosity and bi-modal heterozygous variant allele frequencies to enhance heterozygous deletion and duplication detection, respectively. Compound and potential compound combinations of structural variants and small sequence changes are automatically detected. To facilitate clinical interpretation, identified variants are annotated with phenotype information derived from HGMD Professional and population allele frequencies derived from public and Variantyx allele frequency databases. Single base pair resolution enables easy visual inspection of potentially causal variants using the IGV genome browser as well as easy biochemical validation via PCR. Analytical and clinical sensitivity and specificity of the pipeline have been validated by analyzing Genome in a Bottle reference genomes and known positive samples confirmed by orthogonal sequencing technologies.
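The false-positive filtering idea described above can be illustrated with a minimal sketch. It is not the pipeline's actual implementation: the function names, thresholds, and the input (a list of variant allele frequencies for small variants inside a candidate region) are all assumptions chosen for illustration. The reasoning is that a heterozygous deletion leaves a single copy, so small variants inside it should look homozygous (loss of heterozygosity), whereas a duplication gives three copies, so heterozygous variants should cluster near 1/3 and 2/3 rather than 1/2.

```python
def het_fraction(vafs, lo=0.2, hi=0.8):
    """Fraction of variants whose allele frequency looks heterozygous.

    The 0.2-0.8 window is an illustrative assumption, not a published threshold.
    """
    if not vafs:
        return 0.0
    return sum(lo <= v <= hi for v in vafs) / len(vafs)


def supports_het_deletion(vafs, max_het_fraction=0.05):
    """True if variants in the region show loss of heterozygosity.

    With one remaining copy, nearly all variants should appear homozygous.
    An empty region gives no evidence either way, so it returns False.
    """
    return len(vafs) > 0 and het_fraction(vafs) <= max_het_fraction


def supports_duplication(vafs, tol=0.08, min_fraction=0.8):
    """True if heterozygous variants cluster near 1/3 and 2/3 (three copies).

    tol and min_fraction are hypothetical tuning parameters.
    """
    het = [v for v in vafs if 0.15 <= v <= 0.85]
    if not het:
        return False
    near = sum(min(abs(v - 1 / 3), abs(v - 2 / 3)) <= tol for v in het)
    return near / len(het) >= min_fraction
```

In such a scheme, a candidate deletion called from read depth or breakpoints would be kept only if `supports_het_deletion` agrees, and a candidate duplication only if `supports_duplication` agrees. A real implementation would also need to account for sequencing depth, variant density, and region size.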

Conclusion: Consistent read depth of PCR-free WGS enables reliable detection of structural variants of any size. Annotation at both the gene and variant levels allows clinicians to match the reported patient phenotype with detected variants and to confidently report causative findings in all clinical cases used for validation.

Keywords: Break point; CNV; Clinical validation; Deletion; Diagnostic console; Duplication; Pipeline; Structural variants; WGS; Whole genome sequencing.

MeSH terms

  • Gene Frequency
  • Genetic Variation*
  • Humans
  • Molecular Sequence Annotation
  • Phenotype
  • Reproducibility of Results
  • Whole Genome Sequencing / methods*