Detecting periodic patterns in biological sequences

Bioinformatics. 1998;14(6):498-507. doi: 10.1093/bioinformatics/14.6.498.

Abstract

Motivation: The search for repeated patterns in DNA and protein sequences is important in sequence analysis. The rapid increase in available sequences, in particular from large-scale genome sequencing projects, makes it relevant to develop sensitive automatic methods for the identification of repeats.

Results: A new method for finding periodic patterns in biological sequences is presented. The method is based on evolutionary distance and 'phase shifts' corresponding to insertions and deletions. A given sequence is aligned to itself in a certain sense, with the aim of minimizing a distance to periodicity. Relationships between different periodicity measures of this kind are discussed. An iterative algorithm is used, and the running time is nearly proportional to the sequence length. The alignment produces a periodic consensus pattern. A 'phase score' is used to indicate the statistical significance of the periodicity. Three examples using both DNA and protein sequences illustrate how the method can be used to find patterns.
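
As a rough illustration of what a "distance to periodicity" and a "periodic consensus pattern" might look like, the following Python sketch handles only the simplest case: a fixed period, no phase shifts (no insertions or deletions), and a plain mismatch count in place of an evolutionary distance. It is not the authors' algorithm, and the function names periodic_consensus and distance_to_periodicity are hypothetical, chosen for this example only.

```python
from collections import Counter


def periodic_consensus(seq, period):
    """Return the most common symbol at each phase 0..period-1.

    Simplifying assumption: no phase shifts (indels) are modelled, so
    position i always belongs to phase i % period.
    """
    if not 1 <= period <= len(seq):
        raise ValueError("period must be between 1 and len(seq)")
    consensus = []
    for phase in range(period):
        column = seq[phase::period]  # every symbol sharing this phase
        consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


def distance_to_periodicity(seq, period):
    """Count positions that differ from the periodic consensus.

    Assumption: a Hamming-style mismatch count stands in for the
    evolutionary distance mentioned in the abstract.
    """
    consensus = periodic_consensus(seq, period)
    return sum(1 for i, s in enumerate(seq) if s != consensus[i % period])


if __name__ == "__main__":
    example = "ACGTACGTACCTACGT"  # made-up sequence with one substitution
    print(periodic_consensus(example, 4))
    print(distance_to_periodicity(example, 4))
```

Because this sketch ignores insertions and deletions, a single indel would put every later position out of phase and inflate the count; handling such phase shifts is what the method described above is designed to do.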

Availability: On request from the authors.

Contact: evindc@mat nu.no; finn.drablos@unimed.sintef.no

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Animals
  • Base Sequence
  • Computational Biology
  • DNA / chemistry
  • DNA / genetics
  • Enzyme Inhibitors / chemistry
  • Intracellular Signaling Peptides and Proteins
  • Models, Molecular
  • Molecular Sequence Data
  • Pattern Recognition, Automated
  • Protein Conformation
  • Proteins / chemistry
  • Proteins / genetics
  • Repetitive Sequences, Amino Acid*
  • Repetitive Sequences, Nucleic Acid*
  • Ribonucleases / antagonists & inhibitors
  • Sequence Alignment / methods
  • Sequence Alignment / statistics & numerical data
  • Sequence Analysis / methods*
  • Sequence Analysis / statistics & numerical data

Substances

  • Enzyme Inhibitors
  • Intracellular Signaling Peptides and Proteins
  • Proteins
  • ribonuclease inhibitor, porcine
  • DNA
  • Ribonucleases