Analysis of context of 5'-splice site sequences in mammalian mRNA precursors by subclass method

Comput Appl Biosci. 1992 Aug;8(4):367-76. doi: 10.1093/bioinformatics/8.4.367.

Abstract

The signals that direct the excision of introns from mammalian pre-mRNA are not yet well understood. However, at least three kinds of signals (5'-splice site signals, 3'-splice site signals and branch point signals) play important roles in the excision of introns. In the present paper we treat only the 5'-splice sites. In addition to a consensus sequence for 5'-splice signals, several methods based on a statistical model have been proposed and used to analyze the relative importance of each nucleotide at each position. In our approach a nucleotide sequence is regarded as a string over the symbols 'A', 'T', 'G' and 'C', and important substrings of 5'-splice site sequences, called pattern sequences, are extracted. A pattern sequence expresses which nucleotide is needed at each of a limited number of positions around the 5'-splice site. One particular pattern sequence is observed to match predominantly the 5'-splice site sequences nearest to the 5'-end of a gene, and another to match predominantly the second nearest ones. Moreover, it is confirmed that the pattern sequences accurately predict authentic 5'-splice sites for unknown genes and explain some mutation examples.
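The idea of a pattern sequence, a requirement for specific nucleotides at a few positions around the splice site, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the dictionary representation, the function names, the offset convention (negative offsets for exon bases, positive for intron bases, no offset 0) and the example pattern itself. The subclass method used to extract pattern sequences is not shown.

```python
from typing import Dict, List

# Hypothetical representation of a pattern sequence:
# offset relative to the exon/intron boundary -> required nucleotide.
# Offsets -3, -2, -1 are the last exon bases; +1, +2, ... are the
# first intron bases. There is no offset 0.
Pattern = Dict[int, str]


def offset_to_index(offset: int, boundary: int) -> int:
    """Map a splice-site offset to a 0-based string index.

    `boundary` is the index of the first intron base (offset +1).
    """
    if offset == 0:
        raise ValueError("offset 0 is not used in this convention")
    return boundary + offset - 1 if offset > 0 else boundary + offset


def matches(seq: str, boundary: int, pattern: Pattern) -> bool:
    """Return True if every constrained position carries the required base."""
    for offset, base in pattern.items():
        i = offset_to_index(offset, boundary)
        if i < 0 or i >= len(seq) or seq[i] != base:
            return False
    return True


def candidate_sites(seq: str, pattern: Pattern) -> List[int]:
    """List every boundary index in `seq` at which the pattern matches."""
    return [b for b in range(len(seq) + 1) if matches(seq, b, pattern)]


# Illustrative pattern only (not one of the paper's pattern sequences):
# G at -1, GT at +1/+2, G at +5.
EXAMPLE_PATTERN: Pattern = {-1: "G", 1: "G", 2: "T", 5: "G"}
```

For instance, `candidate_sites("CAGGTAAGTCC", EXAMPLE_PATTERN)` scans every possible boundary in the string and reports those where all four constrained positions agree with the pattern; unconstrained positions are ignored, which is what distinguishes this kind of pattern from a full consensus sequence.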

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence
  • Mammals
  • Molecular Sequence Data
  • Mutation
  • Pattern Recognition, Automated
  • RNA Splicing / genetics*