Gap mapping: a paradigm for aligning two sequences

Appl Bioinformatics. 2003;2(3 Suppl):S31-5.

Abstract

Pairwise sequence alignment is one of the most essential tools in comparative genomic sequence analysis. It is used to compare the sequences of genes and proteins with the aim of inferring structural, functional and evolutionary relationships. However, the optimisation criteria of current 'mainstream' alignment algorithms are driven primarily by computational efficiency and rely on parameters, such as gap penalties, that are not biologically motivated. In addition, algorithms such as the Smith-Waterman technique return a single alignment, which can be sensitive to rather arbitrary choices of these parameters. This paper explores the properties that result from posing the alignment problem as an exercise in 'mapping gaps in sequences'. We argue that this approach is intuitive and provides greater control over the number of gaps placed within an alignment. An approach of this type was proposed by Sankoff (1972) but has received little attention. We compare it with other techniques using structurally confirmed aligned sequences from a benchmark alignment database. The approach consistently provides optimal or near-optimal alignments and is thus a viable approach to sequence alignment.
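
The idea of controlling the number of gaps directly, instead of tuning gap penalties, can be illustrated with a small dynamic program. The sketch below is not the authors' implementation, which the abstract does not describe. It assumes a global alignment, a simple match/mismatch score with no gap penalty, and a "gap" defined as one contiguous run of insertions or deletions. The function name `gap_limited_scores` and its parameters are hypothetical.

```python
NEG = float("-inf")  # marks states that cannot be reached


def gap_limited_scores(a, b, max_gaps, match=1, mismatch=0):
    """Best global alignment score of a and b using at most k gaps,
    for k = 0..max_gaps. A gap is a contiguous run of indels.
    Illustrative sketch only; no gap penalty is applied."""
    n, m = len(a), len(b)
    # dp[k][i][j][s]: best score for a[:i] vs b[:j] with exactly k gap runs,
    # ending in state s (0 = aligned pair, 1 = gap in b, 2 = gap in a).
    dp = [[[[NEG] * 3 for _ in range(m + 1)] for _ in range(n + 1)]
          for _ in range(max_gaps + 1)]
    dp[0][0][0][0] = 0
    for k in range(max_gaps + 1):
        for i in range(n + 1):
            for j in range(m + 1):
                cell = dp[k][i][j]
                if i > 0 and j > 0:
                    # Align a[i-1] with b[j-1]; the gap count is unchanged.
                    s = match if a[i - 1] == b[j - 1] else mismatch
                    cell[0] = max(cell[0], max(dp[k][i - 1][j - 1]) + s)
                if i > 0:
                    # a[i-1] opposite '-': extend the current run or open a new one.
                    ext = dp[k][i - 1][j][1]
                    opn = max(dp[k - 1][i - 1][j][0],
                              dp[k - 1][i - 1][j][2]) if k > 0 else NEG
                    cell[1] = max(cell[1], ext, opn)
                if j > 0:
                    # b[j-1] opposite '-': extend the current run or open a new one.
                    ext = dp[k][i][j - 1][2]
                    opn = max(dp[k - 1][i][j - 1][0],
                              dp[k - 1][i][j - 1][1]) if k > 0 else NEG
                    cell[2] = max(cell[2], ext, opn)
    # Convert "exactly k gaps" into "at most k gaps" with a running maximum.
    out, best = [], NEG
    for k in range(max_gaps + 1):
        best = max(best, max(dp[k][n][m]))
        out.append(best)
    return out


print(gap_limited_scores("ACGTAC", "AGTC", 2))  # [-inf, 2, 4]
```

In this toy case no gap-free global alignment exists because the lengths differ, a single gap recovers two matches, and two gaps recover all four. Reporting the score for each gap budget, rather than one alignment under one penalty setting, is the kind of control over gap number that the abstract refers to.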

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Gene Expression Profiling / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Analysis / methods*
  • Sequence Homology*