E-MSD: an integrated data resource for bioinformatics

S Velankar; P McNeil; V Mittard-Runte; A Suarez; D Barrell; R Apweiler; K Henrick

doi:10.1093/nar/gki058

Nucleic Acids Res. 2005 Jan 1;33(Database issue):D262-5. doi: 10.1093/nar/gki058.

Authors

S Velankar¹, P McNeil, V Mittard-Runte, A Suarez, D Barrell, R Apweiler, K Henrick

Affiliation

¹ Macromolecular Structure Database Group (E-MSD), EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Abstract

The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the worldwide Protein Data Bank (wwPDB) and to work towards the integration of various bioinformatics data resources. One of the major obstacles to better integration of structural databases such as MSD with sequence databases such as UniProt is the absence of an up-to-date, well-maintained mapping between corresponding entries. We have worked closely with the UniProt group at the EBI to clean up the taxonomy and sequence cross-reference information in the MSD and UniProt databases. This information is vital for the reliable integration of sequence family databases such as Pfam and InterPro with the structure-oriented databases SCOP and CATH. It has been made available to the eFamily group (http://www.efamily.org.uk/) and now forms the basis of the regular interchange of information between the member databases (MSD, UniProt, Pfam, InterPro, SCOP and CATH). This exchange has enriched the structural information in the MSD database with annotation from wider sequence-oriented resources. The work was carried out under the 'Structure Integration with Function, Taxonomy and Sequences (SIFTS)' initiative (http://www.ebi.ac.uk/msd-srv/docs/sifts) in the MSD group.
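
As a rough illustration of why a maintained cross-reference matters, the sketch below shows how a structure-to-sequence mapping could let sequence-level annotation be carried over to structure chains. It is not the MSD or SIFTS implementation. The record layout, the function name `annotate_chains`, and every identifier in the data are hypothetical placeholders chosen for this example.

```python
from collections import defaultdict

# Hypothetical cross-reference rows: (structure ID, chain ID, sequence accession).
# All identifiers are placeholders, not real PDB or UniProt entries.
STRUCTURE_TO_SEQUENCE = [
    ("pdb1", "A", "ACC_1"),
    ("pdb1", "B", "ACC_2"),
    ("pdb2", "A", "ACC_1"),
    ("pdb2", "B", "ACC_3"),  # accession with no annotation on the sequence side
]

# Hypothetical family annotations keyed by sequence accession.
SEQUENCE_ANNOTATIONS = {
    "ACC_1": ["pfam:EXAMPLE_1"],
    "ACC_2": ["pfam:EXAMPLE_2", "interpro:EXAMPLE_3"],
}


def annotate_chains(mapping, annotations):
    """Attach sequence-level annotations to each (structure ID, chain ID) pair.

    Chains whose accession has no annotation are kept with an empty list,
    so gaps in the cross-reference remain visible rather than silently dropped.
    """
    result = defaultdict(list)
    for structure_id, chain_id, accession in mapping:
        result[(structure_id, chain_id)].extend(annotations.get(accession, []))
    return dict(result)


chain_annotations = annotate_chains(STRUCTURE_TO_SEQUENCE, SEQUENCE_ANNOTATIONS)
for key, values in sorted(chain_annotations.items()):
    print(key, values)
```

Under these assumptions, a stale or missing row in the mapping shows up directly as a chain with no transferred annotation, which is the kind of gap the clean-up described above is meant to reduce.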

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Amino Acid Sequence
Computational Biology*
Databases, Protein*
Proteins / chemistry*
Proteins / classification
Systems Integration

Substances

Proteins