Direct haplotype-resolved 5-base HiFi sequencing for genome-wide profiling of hypermethylation outliers in a rare disease cohort

Nat Commun. 2023 May 29;14(1):3090. doi: 10.1038/s41467-023-38782-1.

Abstract

Long-read HiFi genome sequencing allows for accurate detection and direct phasing of single nucleotide variants, indels, and structural variants. Recent algorithmic development enables simultaneous detection of CpG methylation for analysis of regulatory element activity directly in HiFi reads. We present a comprehensive haplotype-resolved 5-base HiFi genome sequencing dataset from a rare disease cohort of 276 samples in 152 families to identify rare (~0.5%) hypermethylation events. We find that 80% of these events are allele-specific and predicted to cause loss of regulatory element activity. We demonstrate heritability of extreme hypermethylation, including rare cis variants associated with both short (~200 bp) and large (>1 kb) hypermethylation events. We identify repeat expansions in proximal promoters predicting allelic gene silencing via hypermethylation and demonstrate allelic transcriptional events downstream. On average, 30-40 rare hypermethylation tiles overlap rare disease genes per patient, providing indications for variation prioritization, including a previously undiagnosed pathogenic allele in DIP2B causing global developmental delay. We propose that use of HiFi genome sequencing in unsolved rare disease cases will allow detection of unconventional disease alleles due to loss of regulatory element activity.
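
The idea of calling rare hypermethylation outliers across a cohort can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `rare_hypermethylation_events`, the per-tile input layout, the median baseline, and the `min_delta` threshold are all assumptions made for illustration. Only the ~0.5% rarity cutoff is taken from the abstract.

```python
from statistics import median


def rare_hypermethylation_events(tiles, min_delta=0.5, max_carrier_fraction=0.005):
    """Flag (tile_id, sample_index) pairs that look like rare hypermethylation outliers.

    tiles: dict mapping a tile id to a list of methylation fractions (0..1),
           one value per sample (hypothetical input layout).
    min_delta: assumed minimum excess over the cohort median to count as
               hypermethylated.
    max_carrier_fraction: maximum fraction of samples allowed to carry the
               event for it to count as rare (~0.5% in the abstract).
    """
    events = []
    for tile_id, values in tiles.items():
        baseline = median(values)
        # Samples whose methylation exceeds the cohort baseline by min_delta.
        carriers = [i for i, v in enumerate(values) if v - baseline >= min_delta]
        # Keep the tile only if the event is present and rare in the cohort.
        if carriers and len(carriers) / len(values) <= max_carrier_fraction:
            events.extend((tile_id, i) for i in carriers)
    return events


# Toy cohort of 400 samples (made-up values, not data from the study).
cohort = {
    "tile_A": [0.1] * 399 + [0.9],       # one outlier sample: rare event
    "tile_B": [0.1] * 390 + [0.9] * 10,  # 2.5% of samples: too common
    "tile_C": [0.1] * 400,               # no hypermethylation at all
}
events = rare_hypermethylation_events(cohort)
print(events)  # [('tile_A', 399)]
```

In a haplotype-resolved setting, the same check could be run per haplotype rather than per sample, so that allele-specific events are not diluted by the unmethylated allele.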

MeSH terms

  • Base Sequence
  • DNA Methylation* / genetics
  • Haplotypes
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Nerve Tissue Proteins / genetics
  • Rare Diseases* / genetics
  • Sequence Analysis, DNA

Substances

  • DIP2B protein, human
  • Nerve Tissue Proteins