Prediction of unidentified human genes on the basis of sequence similarity to novel cDNAs from cynomolgus monkey brain

Genome Biol. 2002;3(1):RESEARCH0006. doi: 10.1186/gb-2001-3-1-research0006. Epub 2001 Dec 19.

Abstract

Background: The complete assignment of the protein-coding regions of the human genome is a major challenge for genome biology today. We have already isolated many hitherto unknown full-length cDNAs as orthologs of unidentified human genes from cDNA libraries of the cynomolgus monkey (Macaca fascicularis) brain (parietal lobe and cerebellum). In this study, we used cDNA libraries of three other parts of the brain (frontal lobe, temporal lobe and medulla oblongata) to isolate novel full-length cDNAs.

Results: The entire sequences of novel cDNAs of the cynomolgus monkey were determined, and the orthologous human cDNA sequences were predicted from the human genome sequence. We predicted 29 novel human genes with putative coding regions sharing an open reading frame with the cynomolgus monkey, and we confirmed the expression of 21 pairs of genes by reverse transcription-coupled polymerase chain reaction (RT-PCR). The hypothetical proteins were also functionally annotated by computer analysis.
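
As an illustration only, the idea of a monkey cDNA and a predicted human cDNA "sharing an open reading frame" can be sketched in code. The abstract does not describe the authors' actual procedure, so everything below is an assumption: the function names longest_orf and shares_orf are hypothetical, only the forward strand is scanned, and "sharing" is reduced to the crude proxy that both sequences contain an intact ATG-to-stop ORF of the same length.

```python
# Hypothetical sketch; not the method used in the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, frame) of the longest ATG..stop ORF on the
    forward strand, or None if no complete ORF is found.
    'end' is exclusive and includes the stop codon."""
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end, frame)
                start = None  # look for the next ORF in this frame
    return best


def shares_orf(monkey_cdna, human_cdna):
    """Crude proxy (an assumption): both cDNAs carry a complete ORF
    and the two ORFs have the same length in nucleotides."""
    m = longest_orf(monkey_cdna)
    h = longest_orf(human_cdna)
    if m is None or h is None:
        return False
    return (m[1] - m[0]) == (h[1] - h[0])


if __name__ == "__main__":
    # Toy sequences, made up for illustration.
    print(longest_orf("CCATGAAATAGGG"))
    print(shares_orf("CCATGAAATAGGG", "ATGAAGTAA"))
```

A realistic comparison would also align the translated proteins and handle both strands; this sketch only shows the ORF bookkeeping.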

Conclusions: The 29 new genes had not been discovered in recent searches for novel human genes, and ab initio gene prediction failed to predict all of their exons. Thus, monkey cDNA is a valuable resource for the preparation of a complete human gene catalog, which will facilitate post-genomic studies.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Brain / metabolism*
  • DNA, Complementary / chemistry
  • DNA, Complementary / genetics
  • Gene Expression Regulation
  • Genes / genetics
  • Genome, Human
  • Humans
  • Macaca fascicularis / genetics*
  • RNA / genetics*
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Alignment / methods
  • Sequence Analysis, DNA
  • Software

Substances

  • DNA, Complementary
  • RNA