Characterization and Genomic Localization of a SMAD4 Processed Pseudogene

J Mol Diagn. 2017 Nov;19(6):933-940. doi: 10.1016/j.jmoldx.2017.08.002. Epub 2017 Sep 1.

Abstract

Like many clinical diagnostic laboratories, the Yorkshire Regional Genetics Service undertakes routine investigation of cancer-predisposed individuals by high-throughput sequencing of patient DNA that has been target-enriched for genes associated with hereditary cancer. Accurate diagnosis using such reagents requires alertness regarding rare nonpathogenic variants that may interfere with variant calling. In a cohort of 2042 such cases, we identified 5 that initially appeared to be carriers of a 95-bp deletion of SMAD4 intron 6. More detailed analysis indicated that these individuals all carried one copy of a SMAD4 processed gene. Because of its interference with diagnostic analysis, we characterized this processed gene in detail. Whole-genome sequencing and confirmatory Sanger sequencing of junction PCR products were used to show that in each of the 5 cases, the SMAD4 processed gene was integrated at the same position on chromosome 9, located within the last intron of the SCAI gene. This rare polymorphic processed gene therefore reflects the occurrence of a single ancestral retrotransposition event. Compared to the reference SMAD4 mRNA sequence NM_005359.5 (https://www.ncbi.nlm.nih.gov/nucleotide), the 5' and 3' untranslated regions of the processed gene are both truncated, but its open reading frame is unaltered. Our experience leads us to advocate the use of an RNA-seq aligner as part of diagnostic assay quality assurance, since this allows recognition of processed pseudogenes in a comparatively facile automated fashion.
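The recommendation to use an RNA-seq aligner can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that target-enriched DNA reads have already been aligned with a splice-aware aligner that records a skipped intron as an `N` operation in the CIGAR string (as defined in the SAM format), and that intron coordinates are available from an annotation. The function names, the `min_reads` threshold, and the toy coordinates below are hypothetical; the only detail taken from the abstract is the 95-bp size of the intron used in the example.

```python
import re
from collections import Counter

# CIGAR operations as defined in the SAM format; M, D, N, = and X consume reference bases.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def spliced_gaps(pos, cigar):
    """Return the 1-based inclusive reference intervals skipped by N operations."""
    gaps = []
    ref = pos  # reference position of the next aligned base
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            gaps.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return gaps


def intron_skipping_support(alignments, introns, min_reads=2):
    """Count reads whose spliced gap exactly matches an annotated intron.

    alignments: iterable of (pos, cigar) tuples for reads on one contig.
    introns:    dict mapping an intron name to (start, end), 1-based inclusive.
    min_reads:  hypothetical threshold below which support is not reported.
    """
    by_coords = {coords: name for name, coords in introns.items()}
    counts = Counter()
    for pos, cigar in alignments:
        for gap in spliced_gaps(pos, cigar):
            if gap in by_coords:
                counts[by_coords[gap]] += 1
    return {name: n for name, n in counts.items() if n >= min_reads}


# Toy data only: made-up coordinates for a 95-bp intron, not real SMAD4 positions.
toy_introns = {"intron_6": (1101, 1195)}
toy_alignments = [
    (1051, "50M95N50M"),  # spans the exon-exon junction, skipping the whole intron
    (1071, "30M95N70M"),  # second read supporting the same junction
    (1000, "100M"),       # ordinary unspliced genomic read
    (1061, "40M10N60M"),  # gap that does not match an annotated intron
]

print(intron_skipping_support(toy_alignments, toy_introns))  # {'intron_6': 2}
```

In genomic DNA, reads that skip an annotated intron precisely from boundary to boundary are not expected from the parent gene itself, so recurrent support of this kind across several introns of one gene is a reasonable flag for a processed pseudogene. A real assay would need a validated read threshold and tolerance for small alignment offsets at the junctions.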

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping
  • Chromosomes, Human, Pair 9 / genetics
  • Genomics
  • Heterozygote
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Introns / genetics
  • Neoplasms / diagnosis
  • Neoplasms / genetics*
  • Neoplasms / pathology
  • Pathology, Molecular / methods
  • Pseudogenes / genetics
  • Smad4 Protein / genetics*
  • Transcription Factors / genetics*
  • Whole Genome Sequencing

Substances

  • SCAI protein, human
  • SMAD4 protein, human
  • Smad4 Protein
  • Transcription Factors