ESTprep: preprocessing cDNA sequence reads

Bioinformatics. 2003 Jul 22;19(11):1318-24. doi: 10.1093/bioinformatics/btg159.

Abstract

Motivation: Large-scale gene discovery projects depend on highly accurate data. The data must be not only trustworthy but also correctly annotated for the features they contain. Sequence errors are inherent in single-pass sequences such as ESTs obtained from automated sequencing, and these errors further complicate the automated identification of EST-related sequence features. A tool is required to prepare the data before advanced annotation processing and submission to public databases.

Results: This paper describes ESTprep, a program designed to preprocess expressed sequence tag (EST) sequences. It identifies the location of features present in ESTs and allows a sequence to pass only if it meets various quality criteria. Use of ESTprep has substantially improved the accuracy of EST feature identification and the fidelity of results submitted to GenBank.
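To make the idea of feature location followed by a quality gate concrete, here is a minimal Python sketch. It is not ESTprep's algorithm, which the abstract does not specify. The function names, the choice of a poly(A) tail as the example feature, and every threshold (`min_run`, `min_insert_len`, `max_n_fraction`) are assumptions made purely for illustration.

```python
import re
from typing import Optional


def find_poly_a(seq: str, min_run: int = 12) -> Optional[int]:
    """Return the start index of the first run of at least min_run 'A's, or None.

    Hypothetical helper: a poly(A) tail is used here only as one example
    of a feature that an EST preprocessor might locate.
    """
    match = re.search("A{%d,}" % min_run, seq.upper())
    return match.start() if match else None


def passes_quality(
    seq: str,
    min_insert_len: int = 50,      # assumed threshold, not from the paper
    max_n_fraction: float = 0.05,  # assumed threshold, not from the paper
    min_run: int = 12,
) -> bool:
    """Accept a read only if the region before the poly(A) tail is long
    enough and contains few ambiguous 'N' base calls."""
    seq = seq.upper()
    tail_start = find_poly_a(seq, min_run)
    # Treat everything before the located feature as the cDNA insert.
    insert = seq if tail_start is None else seq[:tail_start]
    if not insert or len(insert) < min_insert_len:
        return False
    return insert.count("N") / len(insert) <= max_n_fraction
```

With these assumed defaults, a read with an 80-base insert followed by a 15-base poly(A) run is accepted, while a read whose insert is only 20 bases long, or in which a quarter of the bases are `N`, is rejected.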

Availability: The program is freely available for download from http://genome.uiowa.edu/pubsoft/software.html

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Base Sequence
  • DNA, Complementary / chemistry*
  • DNA, Complementary / genetics*
  • Expressed Sequence Tags
  • Gene Expression Profiling / methods*
  • Molecular Sequence Data
  • Quality Control
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Software*

Substances

  • DNA, Complementary