Current methods of gene prediction, their strengths and weaknesses

Nucleic Acids Res. 2002 Oct 1;30(19):4103-17. doi: 10.1093/nar/gkf543.

Abstract

While the genomes of many organisms have been sequenced over the last few years, transforming such raw sequence data into knowledge remains a hard task. Many prediction programs have been developed to address one part of this problem: locating the genes along a genome. This paper reviews the existing approaches to predicting genes in eukaryotic genomes and underlines their intrinsic advantages and limitations. The main mathematical models and computational algorithms adopted are also briefly described, and the resulting software is classified according to both the method and the type of evidence used. Finally, the difficulties and pitfalls encountered by the programs are detailed, showing that improvements are needed and that new directions must be considered.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Alternative Splicing / genetics
  • Animals
  • Computational Biology / methods*
  • Expressed Sequence Tags
  • Genes / genetics
  • Genome*
  • Humans
  • Sequence Alignment / methods