Intron exon boundary junctions in human genome have in-built unique structural and energetic signals

Nucleic Acids Res. 2021 Mar 18;49(5):2674-2683. doi: 10.1093/nar/gkab098.

Abstract

Precise identification of exon-intron boundaries is a prerequisite for analyzing the location and structure of genes. The existing framework of genomic signals for delineating exons and introns in a genomic segment seems insufficient, predominantly because of poor sequence consensus and the limitations of training on the available experimental data sets. We present here a novel concept for characterizing exon-intron boundaries in genomic segments on the basis of structural and energetic properties. We analyzed the boundary junctions on both sides of all the exons (328,368) of protein-coding genes from the human genome (GENCODE database) using 28 structural and three energy parameters. A study of sequence conservation at these sites shows very poor consensus. We observe that DNA adopts a unique structural and energy state at the boundary junctions. The signals also differ somewhat between housekeeping and tissue-specific genes. Clustering of the 31 parameters into four derived vectors gives additional insight into the physical mechanisms involved in this biological process. The sites of the structural and energy signals correlate well with the positions that play important roles in pre-mRNA splicing.
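One common way to quantify how strong or weak a sequence consensus is at each position is per-position information content, in bits. The sketch below illustrates that general idea only; it is not the method used in the paper. The function name `information_content` and the toy windows are hypothetical, and no small-sample correction is applied.

```python
import math
from collections import Counter


def information_content(seqs):
    """Return per-position information content (bits) for equal-length DNA windows.

    Computed as 2 - H, where H is the Shannon entropy of the bases observed at
    that position. 2.0 means fully conserved; 0.0 means all four bases are
    equally frequent. Assumes an alphabet of A, C, G, T (hence the maximum of 2 bits).
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    result = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        result.append(2.0 - entropy)
    return result


# Hypothetical toy windows, aligned on a boundary; not data from the study.
windows = ["AGGTAAGT", "CGGTGAGT", "TGGTAAGA", "AAGTACGC"]
ic = information_content(windows)
```

Positions where `ic` stays close to 0 carry little sequence signal, which is the situation the abstract describes as poor consensus.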

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Exons*
  • Genes, Essential
  • Genome, Human*
  • Genomics
  • Humans
  • Introns*
  • RNA Splice Sites

Substances

  • RNA Splice Sites