MagicMatch--cross-referencing sequence identifiers across databases

Bioinformatics. 2005 Aug 15;21(16):3429-30. doi: 10.1093/bioinformatics/bti548. Epub 2005 Jun 16.

Abstract

Motivation: At present, mapping of sequence identifiers across databases is a daunting, time-consuming and computationally expensive process, usually achieved by sequence similarity searches with strict threshold values.

Summary: We present a rapid and efficient method to map sequence identifiers across databases. The method uses the MD5 checksum algorithm, originally designed for verifying message integrity, to generate a fingerprint for each sequence, and uses these fingerprints as hash strings to map sequences across databases. The program, called MagicMatch, can cross-link any of the major sequence databases within a few seconds on a modest desktop computer.
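
The fingerprint idea in the Summary can be illustrated with a minimal Python sketch. This is not the MagicMatch implementation: the function names, the normalization step (removing whitespace and uppercasing) and the in-memory dictionaries standing in for databases are all assumptions made for illustration. Only the use of MD5 digests as lookup keys comes from the abstract.

```python
import hashlib
from collections import defaultdict


def fingerprint(sequence: str) -> str:
    """Return the MD5 hex digest of a sequence.

    Assumption: whitespace is removed and letters are uppercased first,
    so that formatting differences do not change the fingerprint.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def index_by_fingerprint(records: dict) -> dict:
    """Map each fingerprint to the identifiers that share that sequence."""
    index = defaultdict(list)
    for identifier, sequence in records.items():
        index[fingerprint(sequence)].append(identifier)
    return index


def cross_reference(db_a: dict, db_b: dict) -> dict:
    """Map identifiers in db_a to identifiers in db_b with identical sequences."""
    index_b = index_by_fingerprint(db_b)
    mapping = {}
    for id_a, sequence in db_a.items():
        matches = index_b.get(fingerprint(sequence))
        if matches:
            mapping[id_a] = sorted(matches)
    return mapping


# Hypothetical identifiers and sequences, for illustration only.
db_a = {"A1": "MKTAYIAKQR", "A2": "MSLLTEVETY"}
db_b = {"B7": "mktayiakqr", "B8": "GGGGG", "B9": "MKTAYIAKQR"}

print(cross_reference(db_a, db_b))
```

Because each lookup is a dictionary access on a fixed-length digest, matching needs no alignment step. The trade-off in this sketch is that only exactly identical sequences (after the assumed normalization) are linked; sequences differing by even one residue receive unrelated fingerprints.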

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Database Management Systems*
  • Databases, Protein*
  • Information Storage and Retrieval / methods*
  • Molecular Sequence Data
  • Proteins / analysis
  • Proteins / chemistry*
  • Proteins / classification
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Proteins