Pooled library tissue tags for EST-based gene discovery

Bioinformatics. 2002 Sep;18(9):1162-6. doi: 10.1093/bioinformatics/18.9.1162.

Abstract

Motivation: In gene discovery projects based on EST sequencing, effective post-sequencing identification methods are needed to determine the tissue source of each EST within a pooled cDNA library. Past identification efforts have suffered higher-than-necessary failure rates because of errors within the subsequence containing the oligo tag that defines the tissue source of each EST.

Results: A large-scale EST-based gene discovery program at The University of Iowa has led to the creation of a unique software method, UITagCreator, for creating large sets of synthetic tissue identification tags. The tags provide error detection and correction capability and, in conjunction with automated annotation software, substantially improve accurate identification of the tissue source in the presence of sequencing and base-calling errors. The resulting identification rates compare favorably with those of past approaches.
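
The idea of tags with error detection and correction capability can be illustrated with a minimal sketch. This is not the UITagCreator algorithm, which the abstract does not describe. It assumes substitution errors only (no insertions or deletions), fixed-length tags, and a simple greedy construction based on Hamming distance. The function names make_tags and identify and all parameter values are hypothetical. If every pair of tags differs in at least 3 positions, any read with a single substitution is still closer to its true tag than to any other, so the error can be corrected.

```python
from itertools import product
from typing import List, Optional

BASES = "ACGT"


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def make_tags(length: int, min_dist: int) -> List[str]:
    """Greedily collect tags whose pairwise Hamming distance is >= min_dist.

    Candidates are visited in lexicographic order; a candidate is kept only
    if it is far enough from every tag already accepted. (Assumption: a
    real tag set would likely also screen for biological constraints, which
    this sketch ignores.)
    """
    tags: List[str] = []
    for candidate in map("".join, product(BASES, repeat=length)):
        if all(hamming(candidate, t) >= min_dist for t in tags):
            tags.append(candidate)
    return tags


def identify(read: str, tags: List[str], max_errors: int) -> Optional[str]:
    """Return the single tag within max_errors substitutions of the read.

    Returns None when no tag, or more than one tag, is that close, so an
    uncorrectable read is reported as a failure instead of being guessed.
    """
    hits = [t for t in tags if hamming(read, t) <= max_errors]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    tags = make_tags(length=6, min_dist=3)
    # "AAACAA" is one substitution away from the tag "AAAAAA".
    print(identify("AAACAA", tags, max_errors=1))  # AAAAAA
```

With a minimum distance of 3, setting max_errors to 1 gives single-error correction. A larger minimum distance would allow more errors to be corrected, at the cost of fewer tags for a given tag length.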

Availability: The UITagCreator source code and installation instructions, along with detection software usable in concert with created tag sets, are freely available at http://genome.uiowa.edu/pubsoft/software.html

Contact: tomc@eng.uiowa.edu

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence
  • Database Management Systems*
  • Expressed Sequence Tags*
  • Gene Library*
  • Information Storage and Retrieval / methods*
  • Models, Genetic
  • Models, Statistical
  • Molecular Sequence Data
  • Rats
  • Reproducibility of Results
  • Sequence Analysis, DNA
  • Sequence Tagged Sites
  • Software*