Accurate identification of novel human genes through simultaneous gene prediction in human, mouse, and rat

Genome Res. 2004 Apr;14(4):661-4. doi: 10.1101/gr.1939804.

Abstract

We describe a new method for simultaneously identifying novel homologous genes with identical structure in the human, mouse, and rat genomes by combining pairwise predictions made with the SLAM gene-finding program. Using this method, we found 3698 gene triples in the human, mouse, and rat genomes that are predicted with exactly the same gene structure. We show, both computationally and experimentally, that the introns of these triples are predicted accurately compared with the introns of other ab initio gene prediction sets. Computationally, we compared the introns of these gene triples, as well as those from other ab initio gene finders, with known intron annotations. We show that a unique property of SLAM, namely that it predicts gene structures simultaneously in two organisms, is key to producing sets of predictions that are highly accurate in intron structure when combined with other programs. Experimentally, we performed reverse transcription-polymerase chain reaction (RT-PCR) in both human and rat to test the exon pairs flanking introns from a subset of the gene triples for which the human gene had not been previously identified. By performing RT-PCR on orthologous introns in both the human and rat genomes, we additionally explore the validity of using RT-PCR as a method for confirming gene predictions.
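
The idea of combining pairwise predictions into gene triples can be illustrated with a minimal sketch. It is not the authors' implementation and does not call SLAM. It assumes, hypothetically, that each pairwise run (human-mouse, human-rat, mouse-rat) has already been parsed into a list of paired gene structures, with each structure represented as a tuple of exon (start, end) coordinates in its own genome. The function name `find_triples` and this data layout are illustrative assumptions. A triple is kept only when each species' predicted structure is the same in both pairwise runs in which that species appears.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Assumed representation: exon (start, end) coordinates within one genome.
Exons = Tuple[Tuple[int, int], ...]
Pair = Tuple[Exons, Exons]
Triple = Tuple[Exons, Exons, Exons]


def find_triples(hm: List[Pair], hr: List[Pair], mr: List[Pair]) -> List[Triple]:
    """Return (human, mouse, rat) structures supported by all three pairwise runs.

    hm, hr, mr hold (human, mouse), (human, rat), and (mouse, rat) pairs.
    """
    # Index the human-rat pairs by their human structure.
    rat_by_human: Dict[Exons, Set[Exons]] = defaultdict(set)
    for human, rat in hr:
        rat_by_human[human].add(rat)

    mouse_rat_pairs = set(mr)

    triples: List[Triple] = []
    for human, mouse in hm:
        # The human structure must be identical in the human-mouse and human-rat runs.
        for rat in sorted(rat_by_human.get(human, ())):
            # The mouse and rat structures must also be paired in the mouse-rat run.
            if (mouse, rat) in mouse_rat_pairs:
                triples.append((human, mouse, rat))
    return triples


# Toy coordinates, invented for illustration only.
h1, m1, r1 = ((100, 200), (300, 400)), ((50, 150), (250, 350)), ((10, 110), (210, 310))
h2, m2, r2 = ((1000, 1100),), ((900, 1000),), ((800, 900),)
r2_alt = ((800, 950),)  # rat structure that disagrees with the mouse-rat run

consistent = find_triples(
    hm=[(h1, m1), (h2, m2)],
    hr=[(h1, r1), (h2, r2_alt)],
    mr=[(m1, r1), (m2, r2)],
)
```

In this toy input only the first gene survives, because the rat structure paired with the second human gene differs from the one paired with the second mouse gene. Requiring agreement across all three runs trades sensitivity for specificity, which is consistent with the abstract's emphasis on accurate intron structure.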

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Chromosome Mapping / methods
  • Computational Biology / methods
  • Databases, Genetic
  • Exons / genetics
  • Genes / genetics*
  • Genome
  • Genome, Human
  • Humans
  • Introns / genetics
  • Mice
  • Predictive Value of Tests
  • Rats
  • Sequence Homology, Nucleic Acid
  • Software