Sequence alignment kernel for recognition of promoter regions

Bioinformatics. 2003 Oct 12;19(15):1964-71. doi: 10.1093/bioinformatics/btg265.

Abstract

In this paper we propose a new method for recognizing prokaryotic promoter regions with transcription startpoints. The method is based on the Sequence Alignment Kernel, a function that gives a quantitative measure of the match between two sequences. This kernel function is then used in a Dual SVM, which performs the recognition. Several recognition methods were trained and tested on a positive data set of 669 sigma70-promoter regions of Escherichia coli with known transcription startpoints, and on two negative data sets of 709 examples each, taken from coding and non-coding regions of the same genome. The results show that our method performs well, achieving an average error rate of 16.5% on the positive and coding negative data and 18.6% on the positive and non-coding negative data.
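The general idea of feeding an alignment-based similarity into a dual-form classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the Sequence Alignment Kernel, so a plain Needleman-Wunsch global alignment score is substituted as the similarity (such a score is not guaranteed to be a valid positive semidefinite kernel), and a simple dual-form perceptron stands in for the Dual SVM. The scoring parameters, the function names, and the four toy sequences are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: NOT the paper's Sequence Alignment Kernel or Dual SVM.
# Assumptions: Needleman-Wunsch score as similarity, dual-form perceptron as classifier.

def alignment_score(a, b, match=1.0, mismatch=-1.0, gap=-2.0):
    """Global alignment score of strings a and b (linear gap penalty)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]

def gram_matrix(seqs):
    """Pairwise similarity matrix used by the dual-form learner."""
    return [[alignment_score(s, t) for t in seqs] for s in seqs]

def train_dual_perceptron(gram, labels, epochs=10):
    """Learn one non-negative weight per training example (dual form)."""
    n = len(labels)
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            f = sum(alpha[j] * labels[j] * gram[j][i] for j in range(n))
            if labels[i] * f <= 0:  # misclassified or undecided: add weight
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha

def predict(x, seqs, labels, alpha):
    """Sign of the similarity-weighted vote of the training examples."""
    f = sum(a * y * alignment_score(s, x)
            for a, y, s in zip(alpha, labels, seqs))
    return 1 if f > 0 else -1

# Hypothetical toy data: +1 = promoter-like, -1 = non-promoter.
seqs = ["TATAAT", "TATAAA", "GCGCGC", "GGGCGC"]
labels = [1, 1, -1, -1]
gram = gram_matrix(seqs)
alpha = train_dual_perceptron(gram, labels)
```

The classifier touches the sequences only through pairwise similarity values, which is the property that lets an alignment-based function replace a fixed feature vector. With these definitions, `predict("TATACT", seqs, labels, alpha)` classifies a new sequence by its similarity to the weighted training examples.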

Availability: A demo version of our method is available from our website: http://mendel.cs.rhul.ac.uk/

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Escherichia coli / genetics*
  • Gene Expression Profiling / methods*
  • Pattern Recognition, Automated*
  • Promoter Regions, Genetic / genetics*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*