Designing XML schemas for bioinformatics

Biotechniques. 2003 Jun;34(6):1200-2, 1204, 1206 passim. doi: 10.2144/03346st03.

Abstract

Data interchange between bioinformatics databases will, in the future, most likely take place using extensible markup language (XML). The document structure will be described by an XML Schema rather than a document type definition (DTD). To ensure flexibility, the XML Schema must incorporate aspects of Object-Oriented Modeling. This impinges on the choice of the data model, which, in turn, is based on the organization of bioinformatics data by biologists. Thus, there is a need for the general bioinformatics community to be aware of the design issues relating to XML Schema. This paper, which is aimed at a general bioinformatics audience, uses examples to describe the differences between a DTD and an XML Schema and indicates how Unified Modeling Language diagrams may be used to incorporate Object-Oriented Modeling in the design of schemas.
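
The contrast drawn in the abstract can be sketched with a small example. The element and type names below (gene, BioSequence, Gene, exonCount) are hypothetical illustrations and are not taken from the paper. The DTD fragment can only list which child elements occur, whereas the XML Schema fragment declares a base type and derives a second type from it by extension, which is the schema counterpart of inheritance in an object-oriented model. The Python code uses only the standard library to read the schema and report which type derives from which; it does not validate any documents.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical DTD fragment, shown for comparison only: it names child
# elements but has no datatypes and no way to derive one type from another.
DTD = """<!ELEMENT gene (name, residues)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT residues (#PCDATA)>"""

# Hypothetical XML Schema: Gene extends BioSequence, much as a subclass
# extends a superclass in a UML class diagram.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="BioSequence">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="residues" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Gene">
    <xs:complexContent>
      <xs:extension base="BioSequence">
        <xs:sequence>
          <xs:element name="exonCount" type="xs:nonNegativeInteger"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="gene" type="Gene"/>
</xs:schema>"""


def inheritance_map(xsd_text):
    """Map each top-level complexType name to its base type (None if it has none)."""
    root = ET.fromstring(xsd_text)
    result = {}
    for ctype in root.findall(f"{{{XS}}}complexType"):
        # A derived type wraps its additions in complexContent/extension.
        ext = ctype.find(f"{{{XS}}}complexContent/{{{XS}}}extension")
        result[ctype.get("name")] = ext.get("base") if ext is not None else None
    return result
```

Calling inheritance_map(XSD) returns a dictionary in which Gene maps to BioSequence and BioSequence maps to None, mirroring a two-class inheritance hierarchy that a DTD has no syntax to express.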

MeSH terms

  • Biotechnology
  • Computational Biology / statistics & numerical data*
  • Databases, Genetic
  • Programming Languages*