JEnsembl: a version-aware Java API to Ensembl data systems

Bioinformatics. 2012 Nov 1;28(21):2724-31. doi: 10.1093/bioinformatics/bts525. Epub 2012 Sep 3.

Abstract

Motivation: The Ensembl Project provides release-specific Perl APIs for efficient high-level programmatic access to data stored in various Ensembl database schema. Although Perl scripts are perfectly suited for processing large volumes of text-based data, Perl is not ideal for developing large-scale software applications nor embedding in graphical interfaces. The provision of a novel Java API would facilitate type-safe, modular, object-orientated development of new Bioinformatics tools with which to access, analyse and visualize Ensembl data.

Results: The JEnsembl API implementation provides basic data retrieval and manipulation functionality from the Core, Compara and Variation databases for all species in Ensembl and EnsemblGenomes and is a platform for the development of a richer API to Ensembl datasources. The JEnsembl architecture uses a text-based configuration module to provide evolving, versioned mappings from database schema to code objects. A single installation of the JEnsembl API can therefore simultaneously and transparently connect to current and previous database instances (such as those in the public archive) thus facilitating better analysis repeatability and allowing 'through time' comparative analyses to be performed.

Availability: Project development, released code libraries, Maven repository and documentation are hosted at SourceForge (http://jensembl.sourceforge.net).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Databases, Factual*
  • Genomic Library*
  • Indonesia
  • Information Storage and Retrieval / methods*
  • Software*