QSAR study for mycobacterial promoters with low sequence homology

Bioorg Med Chem Lett. 2006 Feb;16(3):547-53. doi: 10.1016/j.bmcl.2005.10.057. Epub 2005 Nov 4.

Abstract

The general belief is that quantitative structure-activity relationship (QSAR) techniques work only for small molecules and protein sequences or, more recently, DNA sequences. However, with the non-branched graphs of proteins and DNA sequences, QSAR models often have to be based on powerful non-linear techniques such as support vector machines. In our opinion, linear QSAR models based on RNA could be useful to assign biological activity when alignment techniques fail due to low sequence homology. The idea is based on the high level of branching of the RNA graph. This work introduces the so-called Markov electrostatic potentials (k)xi(M) as a new class of RNA 2D-structure descriptors. Subsequently, we validate these molecular descriptors by solving a QSAR classification problem for mycobacterial promoter sequences (mps), which constitute a very low sequence homology problem. The model developed (mps = -4.664 × (0)xi(M) + 0.991 × (1)xi(M) - 2.432) was intended to predict whether a naturally occurring sequence is an mps or not on the basis of the calculated (k)xi(M) values for the corresponding RNA secondary structure. The RNA-QSAR approach recognises 115/135 mps (85.2%) and 100% of control sequences. Average predictability and robustness were greater than 95%. A previous non-linear model predicts mps with a slightly higher accuracy (97%) but uses a very large parameter space for DNA sequences. Conversely, the (k)xi(M)-based RNA-QSAR encodes more structural information and needs only two variables.
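
As an illustration of how a two-variable linear model of this kind could be applied, the following minimal Python sketch evaluates the reported equation and re-derives the reported recognition rate. The coefficients and the 115/135 count come from the abstract; the function names, the zero decision threshold, the rule that a positive score means "mps", and the sample descriptor values are assumptions made for illustration only. The abstract does not state the decision rule, and computing (k)xi(M) from an RNA secondary structure is not shown here.

```python
# Coefficients taken from the abstract's model:
#   mps = -4.664 * (0)xi(M) + 0.991 * (1)xi(M) - 2.432
COEF_XI0 = -4.664
COEF_XI1 = 0.991
INTERCEPT = -2.432


def mps_score(xi0: float, xi1: float) -> float:
    """Evaluate the linear model for given (0)xi(M) and (1)xi(M) values."""
    return COEF_XI0 * xi0 + COEF_XI1 * xi1 + INTERCEPT


def is_mps(xi0: float, xi1: float, threshold: float = 0.0) -> bool:
    """Classify a sequence from its score.

    ASSUMPTION: a score above `threshold` is read as "mps". The abstract
    does not give the cut-off or the sign convention.
    """
    return mps_score(xi0, xi1) > threshold


def recognition_rate(recognised: int, total: int) -> float:
    """Percentage of sequences recognised, rounded to one decimal place."""
    return round(100.0 * recognised / total, 1)


if __name__ == "__main__":
    # Hypothetical descriptor values, NOT taken from the paper.
    xi0, xi1 = 1.0, 2.0
    print(f"score = {mps_score(xi0, xi1):.3f}, mps? {is_mps(xi0, xi1)}")

    # Counts reported in the abstract: 115 of 135 mps recognised.
    print(f"recognition rate = {recognition_rate(115, 135)}%")
```

Because the model uses only two descriptors and three fitted constants, the whole classifier reduces to one arithmetic expression, which is the contrast the abstract draws with the larger parameter space of the earlier non-linear model.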

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / physiology
  • Base Sequence
  • DNA, Bacterial / chemistry*
  • DNA, Bacterial / genetics
  • Discriminant Analysis
  • Drug Design
  • Markov Chains
  • Models, Biological
  • Molecular Sequence Data
  • Mycobacterium / genetics*
  • Promoter Regions, Genetic*
  • Quantitative Structure-Activity Relationship
  • RNA / analysis
  • RNA / chemistry
  • Sequence Homology*
  • Static Electricity

Substances

  • Bacterial Proteins
  • DNA, Bacterial
  • RNA