ANDY: a general, fault-tolerant tool for database searching on computer clusters

Bioinformatics. 2006 Mar 1;22(5):618-20. doi: 10.1093/bioinformatics/btk020. Epub 2006 Jan 5.

Abstract

Summary: ANDY (seArch coordination aND analYsis) is a set of Perl programs and modules for distributing large biological database searches, and more generally any sequence of commands, across the nodes of a Linux computer cluster. ANDY is compatible with several commonly used distributed resource management (DRM) systems, and it can be easily extended to new DRMs. A distinctive feature of ANDY is the choice of either dedicated or fair-use operation: ANDY is almost as efficient as single-purpose tools that require a dedicated cluster, but it can also run on a general-purpose cluster alongside any other jobs scheduled by a DRM. Other features include communication through named pipes for performance; flexible, customizable routines for error-checking and summarizing results; and multiple fault-tolerance mechanisms.
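
As an illustration of the named-pipe communication mentioned in the summary, the following is a minimal Python sketch of the general technique, not ANDY's own Perl code. The function name `send_through_fifo` and the pipe name `results.fifo` are hypothetical, chosen only for this example, and the sketch assumes a POSIX system such as Linux where `os.mkfifo` is available.

```python
import os
import tempfile
import threading


def send_through_fifo(lines):
    """Pass lines from a producer thread to a consumer through a named pipe.

    Illustrative only: the names here are hypothetical and are not taken
    from ANDY.
    """
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "results.fifo")  # hypothetical pipe name
    os.mkfifo(fifo_path)  # create the named pipe (POSIX only)

    def producer():
        # Opening a FIFO for writing blocks until a reader opens the other end.
        with open(fifo_path, "w") as pipe:
            for line in lines:
                pipe.write(line + "\n")

    writer = threading.Thread(target=producer)
    writer.start()
    try:
        # The consumer reads until the producer closes its end of the pipe;
        # the data passes through the kernel rather than a regular file.
        with open(fifo_path) as pipe:
            received = [line.rstrip("\n") for line in pipe]
    finally:
        writer.join()
        os.remove(fifo_path)
        os.rmdir(workdir)
    return received
```

In this sketch the producer stands in for a search process emitting results and the consumer for a process that collects them. Because the pipe is a path on the filesystem, two unrelated processes could open it by name in the same way as the two threads do here.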

Availability: ANDY is freely available and can be obtained from http://compbio.berkeley.edu/proj/andy.

Supplementary information: Supplemental data, figures, and a more detailed overview of the software are found at http://compbio.berkeley.edu/proj/andy.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computing Methodologies
  • Database Management Systems*
  • Databases, Genetic*
  • Information Storage and Retrieval / methods*
  • Internet*
  • Online Systems*
  • Sequence Alignment / methods*
  • Sequence Analysis / methods*