70ProPred: a predictor for discovering sigma70 promoters based on combining multiple features

BMC Syst Biol. 2018 Apr 24;12(Suppl 4):44. doi: 10.1186/s12918-018-0570-1.

Abstract

Background: A promoter is an important regulatory sequence element that controls the initiation of gene transcription. In prokaryotes, σ70 promoters regulate the transcription of most genes. Promoter recognition is a crucial part of gene structure recognition and a core problem in constructing gene transcriptional regulatory networks. Although genome sequencing has been successfully completed for an increasing number of microbial species, accurately identifying σ70 promoter regions in DNA sequences remains difficult.

Results: To improve the prediction accuracy of sigma70 promoters in prokaryotes, a promoter recognition model, 70ProPred, was established. Two sequence-based features, position-specific trinucleotide propensity based on single-stranded characteristic (PSTNPss) and electron-ion potential values for trinucleotides (PseEIIP), were assessed to build the best prediction model. Combining 79 PSTNPss features with 64 PseEIIP features gave the best performance for sigma70 promoter identification, with an accuracy of 95.56% and a Matthews correlation coefficient (MCC) of 0.90.
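The two feature types named in the Results can be illustrated with a minimal sketch. It assumes the commonly used definitions, which the abstract itself does not spell out: PseEIIP as the summed per-nucleotide EIIP values of each of the 64 trinucleotides weighted by that trinucleotide's frequency in the sequence, and PSTNPss as the per-position difference between a trinucleotide's frequency in the positive and negative training sets. The function names are hypothetical, and the EIIP constants are the standard published nucleotide values rather than figures taken from this abstract. A count of 79 positional features would be consistent with 81-nucleotide input sequences (81 - 2 overlapping trinucleotides), although the abstract does not state the sequence length.

```python
from itertools import product

# Standard per-nucleotide EIIP values (assumed; not stated in the abstract).
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]  # 64 entries


def pse_eiip(seq):
    """Return 64 values: trinucleotide EIIP sum times its frequency in seq."""
    seq = seq.upper()
    n_windows = len(seq) - 2
    if n_windows < 1:
        raise ValueError("sequence must contain at least 3 nucleotides")
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    for i in range(n_windows):
        tri = seq[i:i + 3]
        if tri in counts:  # windows containing non-ACGT symbols are skipped
            counts[tri] += 1
    return [sum(EIIP[b] for b in tri) * counts[tri] / n_windows
            for tri in TRINUCLEOTIDES]


def position_frequencies(seqs):
    """Per-position trinucleotide frequencies over equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    table = [dict.fromkeys(TRINUCLEOTIDES, 0.0) for _ in range(length - 2)]
    for s in seqs:
        s = s.upper()
        for j in range(length - 2):
            tri = s[j:j + 3]
            if tri in table[j]:
                table[j][tri] += 1.0 / len(seqs)
    return table


def pstnpss(seq, pos_freq, neg_freq):
    """Return len(seq) - 2 values: positive minus negative positional frequency."""
    seq = seq.upper()
    return [pos_freq[j].get(seq[j:j + 3], 0.0) - neg_freq[j].get(seq[j:j + 3], 0.0)
            for j in range(len(seq) - 2)]
```

Under these assumptions, a combined feature vector would be `pstnpss(seq, pos_freq, neg_freq) + pse_eiip(seq)`. The frequency tables should be built from training sequences only, so that the sequence being scored does not leak into its own features.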

Conclusion: Jackknife tests showed that 70ProPred outperforms existing sigma70 promoter prediction approaches in terms of accuracy and stability. The approach can also be extended to predict promoters of other species. For the convenience of experimental biologists, an online web server for the proposed method was established, which is freely available at http://server.malab.cn/70ProPred/ .
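For readers unfamiliar with the evaluation terms, the following is a minimal sketch of a jackknife (leave-one-out) test that reports accuracy and MCC. It is not the authors' pipeline: the keywords indicate that 70ProPred uses an SVM, whereas this sketch substitutes a simple nearest-centroid classifier so that it runs with the standard library alone. All function names are hypothetical, and labels are assumed to be 1 for promoter and 0 for non-promoter.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Compute accuracy and Matthews correlation coefficient from counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT the SVM used by 70ProPred)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        # Pick the label whose centroid has the smallest squared distance.
        return min(centroids,
                   key=lambda lab: sum((a - b) ** 2
                                       for a, b in zip(x, centroids[lab])))
    return predict


def jackknife(X, y, fit=nearest_centroid_fit):
    """Hold out each sample once, train on the rest, and pool the results."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        predict = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        pred = predict(X[i])
        if pred == 1 and y[i] == 1:
            tp += 1
        elif pred == 0 and y[i] == 0:
            tn += 1
        elif pred == 1 and y[i] == 0:
            fp += 1
        else:
            fn += 1
    return accuracy_and_mcc(tp, tn, fp, fn)
```

Because each sample is held out exactly once, the jackknife gives a single deterministic result for a given dataset, which is one reason it is often preferred over random splits when comparing predictors.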

Keywords: PSTNPSS; PseEIIP; SVM; sigma70 promoter.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • DNA, Single-Stranded / genetics
  • DNA, Single-Stranded / metabolism
  • DNA-Directed RNA Polymerases / metabolism*
  • Promoter Regions, Genetic*
  • Protein Binding
  • Sigma Factor / metabolism*
  • Transcription, Genetic

Substances

  • DNA, Single-Stranded
  • Sigma Factor
  • RNA polymerase sigma 70
  • DNA-Directed RNA Polymerases