Fast matching of transcription factor motifs using generalized position weight matrix models

J Comput Biol. 2013 Sep;20(9):621-30. doi: 10.1089/cmb.2012.0289. Epub 2013 Aug 6.

Abstract

The problem of finding the locations in DNA sequences that match a given motif describing the binding specificities of a transcription factor (TF) has many applications in computational biology. This problem has been extensively studied when the position weight matrix (PWM) model is used to represent motifs. We investigate it under the feature motif model, a generalization of the PWM model that does not assume independence between positions in the pattern while being compatible with the original PWM. We present a new method for finding the binding sites of a transcription factor in a DNA sequence when the feature motif model is used to describe transcription factor binding specificities. The experimental results on random and real data show that the search algorithm is fast in practice.
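For orientation, the baseline problem the abstract refers to, scanning a DNA sequence for windows that score above a threshold under a standard PWM, can be sketched as follows. This is a minimal illustration of plain PWM matching only, not the feature motif search method presented in the paper. The function names (`log_odds_pwm`, `scan`), the pseudocount, the uniform background frequency, and the threshold are all assumptions made for the example.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log-odds PWM.

    counts: list of dicts mapping base -> observed count, one per motif position.
    The pseudocount and uniform background are illustrative assumptions.
    """
    pwm = []
    for col in counts:
        total = sum(col.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((col.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at least `threshold`.

    Positions are scored independently and summed, which is exactly the
    independence assumption that the feature motif model relaxes.
    """
    m = len(pwm)
    hits = []
    for i in range(len(sequence) - m + 1):
        window = sequence[i:i + m]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing ambiguous symbols such as N
        score = sum(pwm[j][window[j]] for j in range(m))
        if score >= threshold:
            hits.append((i, score))
    return hits

# Hypothetical motif with consensus TATA, built from made-up counts.
counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
pwm = log_odds_pwm(counts)
hits = scan("GGTATAGG", pwm, threshold=3.0)
```

This naive scan costs time proportional to the sequence length times the motif length. Because each position contributes to the score independently, a dependency between two positions cannot be expressed in this form.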

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Motifs
  • Computational Biology / methods
  • Models, Genetic*
  • Response Elements / genetics*
  • Transcription Factors / genetics*

Substances

  • Transcription Factors