Integrating genomic homology into gene structure prediction

Bioinformatics. 2001;17 Suppl 1:S140-8. doi: 10.1093/bioinformatics/17.suppl_1.s140.

Abstract

TWINSCAN is a new gene-structure prediction system that directly extends the probability model of GENSCAN, allowing it to exploit homology between two related genomes. Separate probability models are used for conservation in exons, introns, splice sites, and UTRs, reflecting the differences among their patterns of evolutionary conservation. TWINSCAN is specifically designed for the analysis of high-throughput genomic sequences containing an unknown number of genes. In experiments on high-throughput mouse sequences, using homologous sequences from the human genome, TWINSCAN shows notable improvement over GENSCAN in exon sensitivity and specificity and dramatic improvement in exact gene sensitivity and specificity. This improvement can be attributed entirely to modeling the patterns of evolutionary conservation in genomic sequence.
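The abstract reports exon-level sensitivity and specificity without defining them. As a minimal sketch, assuming the convention common in gene-prediction evaluation (sensitivity is the fraction of annotated exons predicted exactly; specificity is the fraction of predicted exons that exactly match an annotated exon), the two measures could be computed as below. The function name `exon_accuracy` and the coordinates are hypothetical illustrations, not taken from the paper or from TWINSCAN.

```python
def exon_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) at the exact-exon level.

    Each exon is a (start, end) tuple. An exon counts as correct only
    if both boundaries match an annotated exon exactly.
    Assumption: "specificity" here means correct / predicted, as is
    conventional in gene-prediction evaluation.
    """
    annotated = set(annotated)
    predicted = set(predicted)
    exact = annotated & predicted  # exons with both boundaries correct
    sensitivity = len(exact) / len(annotated) if annotated else 0.0
    specificity = len(exact) / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical coordinates, for illustration only.
annotated_exons = [(100, 200), (300, 450), (600, 720), (900, 1000)]
predicted_exons = [(100, 200), (300, 440), (600, 720)]

sn, sp = exon_accuracy(annotated_exons, predicted_exons)
print(f"exon sensitivity = {sn:.2f}, exon specificity = {sp:.2f}")
```

In this toy case two of the four annotated exons are predicted exactly, and two of the three predictions are exact matches; the prediction (300, 440) misses one boundary and so counts toward neither measure. Exact gene sensitivity and specificity would follow the same pattern, with a gene counted as correct only when every one of its exons matches.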

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Animals
  • Base Sequence
  • Computational Biology
  • Conserved Sequence
  • DNA / genetics
  • Evolution, Molecular
  • Genome*
  • Genome, Human
  • Humans
  • Mice
  • Models, Statistical
  • Sensitivity and Specificity
  • Sequence Alignment / statistics & numerical data*
  • Sequence Homology, Nucleic Acid

Substances

  • DNA