Gap mapping: a paradigm for aligning two sequences

Appl Bioinformatics. 2003;2(3 Suppl):S31-5.

Authors

Matthew Bellgard¹, Thomas Gamble, Mark Reynolds, Adam Hunter, Ed Trifonov, Ross Taplin

Affiliation

¹ Centre for Bioinformatics and Biological Computing, School of Information Technology, Murdoch University, WA, Australia. m.bellgard@murdoch.edu.au

PMID: 15130814

Abstract

Pairwise sequence alignment is one of the most essential tools in comparative genomic sequence analysis. It is used to compare the sequences of genes and proteins with the aim of inferring structural, functional and evolutionary relationships. However, current 'mainstream' alignment algorithms use optimisation criteria that are based primarily on computational efficiency and rely on parameters, such as gap penalties, that are not biologically motivated. In addition, current alignment algorithms such as the Smith and Waterman technique provide a single alignment, which can be sensitive to rather arbitrary choices of parameters such as gap penalties. This paper explores the range of properties that result from posing the alignment problem as an exercise in 'mapping gaps in sequences'. We argue that this approach is intuitive and provides greater control over the number of gaps placed within an alignment. An approach of this type was proposed by Sankoff (1972) but has unfortunately received little attention. We report and discuss our findings by comparing this approach with other techniques, using structurally confirmed aligned sequences from a benchmark alignment database. Interestingly, this approach consistently provides optimal and near-optimal alignments and is thus a viable approach to sequence alignment.
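The idea of controlling the number of gaps directly, rather than through a gap penalty, can be illustrated with a small dynamic programme. The sketch below is not the authors' implementation and is not taken from the paper. It assumes one simple reading of the gap-constrained formulation: a global alignment, a "gap" defined as a maximal run of consecutive insertions or deletions in one sequence (terminal runs included), and an objective of maximising the number of identical aligned positions. The function name `max_matches_by_gap_count` and all of its parameters are hypothetical.

```python
def max_matches_by_gap_count(a, b, max_gaps):
    """Return a list whose entry k is the largest number of identical aligned
    positions in a global alignment of a and b that uses at most k gaps,
    or None if no such alignment exists.

    Assumption: a gap is a maximal run of indels in one sequence.
    """
    NEG = float("-inf")
    n, m = len(a), len(b)
    M, X, Y = 0, 1, 2  # last column: residue pair, gap in b, gap in a

    # dp[state][g][i][j]: best match count for a[:i] vs b[:j] using exactly g gaps
    dp = [[[[NEG] * (m + 1) for _ in range(n + 1)]
           for _ in range(max_gaps + 1)] for _ in range(3)]
    dp[M][0][0][0] = 0  # empty alignment, no gaps opened

    for g in range(max_gaps + 1):
        for i in range(n + 1):
            for j in range(m + 1):
                if i > 0 and j > 0:
                    # Align a[i-1] with b[j-1]; the gap count is unchanged.
                    prev = max(dp[s][g][i - 1][j - 1] for s in (M, X, Y))
                    dp[M][g][i][j] = prev + (a[i - 1] == b[j - 1])
                if i > 0:
                    # a[i-1] opposite a gap in b: extend a run or open a new one.
                    extend = dp[X][g][i - 1][j]
                    opened = NEG
                    if g > 0:
                        opened = max(dp[M][g - 1][i - 1][j], dp[Y][g - 1][i - 1][j])
                    dp[X][g][i][j] = max(extend, opened)
                if j > 0:
                    # b[j-1] opposite a gap in a: extend a run or open a new one.
                    extend = dp[Y][g][i][j - 1]
                    opened = NEG
                    if g > 0:
                        opened = max(dp[M][g - 1][i][j - 1], dp[X][g - 1][i][j - 1])
                    dp[Y][g][i][j] = max(extend, opened)

    result, best = [], NEG
    for g in range(max_gaps + 1):
        # Convert "exactly g gaps" into "at most g gaps" with a running maximum.
        best = max(best, max(dp[s][g][n][m] for s in (M, X, Y)))
        result.append(None if best == NEG else best)
    return result


if __name__ == "__main__":
    # The second sequence lacks one residue, so one gap is needed and suffices.
    print(max_matches_by_gap_count("ACGTACGT", "ACGACGT", 2))  # [None, 7, 7]
```

Reporting the score for every permitted gap count, instead of a single alignment chosen by a gap penalty, shows how much each additional gap buys. In this toy case a second gap adds nothing beyond the first.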

Publication types

Comparative Study
Evaluation Study
Validation Study

MeSH terms

Algorithms*
Gene Expression Profiling / methods*
Reproducibility of Results
Sensitivity and Specificity
Sequence Alignment / methods*
Sequence Analysis / methods*
Sequence Homology*