SNAD: Sequence Name Annotation-based Designer

Igor A Sidorov; Denis A Reshetov; Alexander E Gorbalenya

doi:10.1186/1471-2105-10-251

BMC Bioinformatics. 2009 Aug 14:10:251. doi: 10.1186/1471-2105-10-251.

Authors

Igor A Sidorov¹, Denis A Reshetov, Alexander E Gorbalenya

Affiliation

¹ Molecular Virology Laboratory, Department of Medical Microbiology, Center of Infectious Diseases, Leiden University Medical Center, Leiden, Netherlands. i.a.sidorov@lumc.nl

Abstract

Background: A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often replaced by biologically meaningful names in various presentations to facilitate the use and dissemination of sequence-based knowledge. This substitution is commonly done manually, which can be tedious and prone to mistakes and omissions.

Results: Here we introduce SNAD (Sequence Name Annotation-based Designer), which automatically converts sequence UIDs (associated with a multiple alignment or phylogenetic tree, or supplied as a plain-text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology.
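
The template-directed conversion described above can be illustrated with a minimal Python sketch. This is not SNAD's implementation or its template syntax: the `${field}` placeholders, the function names `acronym`, `design_name`, and `rename_all`, and the annotation records (including the UIDs and field names) are all hypothetical and stand in for annotation that would normally be retrieved from an external database.

```python
from string import Template

# Hypothetical annotation records keyed by UID. In a real workflow these
# fields would come from the cognate entries of an external database.
ANNOTATIONS = {
    "UID0001": {"organism": "Example virus A", "strain": "X1", "year": "2001"},
    "UID0002": {"organism": "Example virus B", "strain": "Y2", "year": "2005"},
}


def acronym(text):
    """Build an acronym from the first letter of each word."""
    return "".join(word[0].upper() for word in text.split())


def design_name(uid, template, annotations=ANNOTATIONS):
    """Fill a name template with the annotation fields of one UID.

    UIDs without annotation are returned unchanged, so no sequence
    is silently dropped from an alignment, tree, or list.
    """
    record = annotations.get(uid)
    if record is None:
        return uid
    fields = dict(record, uid=uid, acronym=acronym(record["organism"]))
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return Template(template).safe_substitute(fields)


def rename_all(uids, template, annotations=ANNOTATIONS):
    """Map every UID in a plain list to its designed name."""
    return {uid: design_name(uid, template, annotations) for uid in uids}


if __name__ == "__main__":
    names = rename_all(["UID0001", "UID0002", "UID9999"],
                       "${acronym}_${strain}_${year}")
    for uid, name in names.items():
        print(uid, "->", name)
```

Under these assumptions, one template applied to a list of UIDs yields a consistent naming scheme, and changing the template changes every name at once, which is the kind of controllable conversion the abstract describes.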

Conclusion: A tool for controllable, annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between the quality of sequence annotation and the efficiency of communication and knowledge dissemination among researchers.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computational Biology / methods*
Databases, Factual
Internet
Sequence Analysis*
Sequence Analysis, DNA / methods
Software*
Terminology as Topic
User-Computer Interface