Validation of the Crystallography Open Database using the Crystallographic Information Framework

J Appl Crystallogr. 2021 Feb 14;54(Pt 2):661-672. doi: 10.1107/S1600576720016532. eCollection 2021 Apr 1.

Abstract

The data curation practices of the Crystallography Open Database (COD) are described, with particular focus on formal validation using the Crystallographic Information Framework (CIF). The cif_validate program, which validates CIF files against both DDL1 and DDLm dictionaries, is presented and used to process the entire COD. Validation results collected from over 450 000 CIF files are shown to be a useful resource both for data maintenance and for the development of the underlying ontologies. A set of programs intended to aid the migration of dictionaries from DDL1 to DDLm is also presented.
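To make the idea of dictionary-based validation concrete, the following minimal Python sketch checks a few single-line data items against a small hand-written dictionary. It is not the cif_validate implementation and does not read DDL1 or DDLm files: the TOY_DICTIONARY contents, the function names parse_simple_items and validate, and the message wording are all assumptions made for illustration, and loops and multi-line text fields are deliberately ignored.

```python
import re

# Hypothetical, hand-written stand-in for a CIF dictionary. Real DDL1/DDLm
# dictionaries are CIF-formatted files with far richer definitions.
TOY_DICTIONARY = {
    "_cell_length_a": {"type": "numb", "min": 0.0},
    "_cell_angle_alpha": {"type": "numb", "min": 0.0, "max": 180.0},
    "_symmetry_cell_setting": {
        "type": "char",
        "enum": {"triclinic", "monoclinic", "orthorhombic", "tetragonal",
                 "rhombohedral", "trigonal", "hexagonal", "cubic"},
    },
}

# A number with an optional standard uncertainty in parentheses, e.g. 5.431(2)
NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?$")


def parse_simple_items(text):
    """Collect single-line `_name value` pairs; loops and text fields are skipped."""
    items = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].startswith("_"):
            # Data names are case-insensitive, so normalise them to lower case.
            items[parts[0].lower()] = parts[1].strip().strip("'\"")
    return items


def validate(items, dictionary):
    """Return a list of human-readable validation messages (empty if none)."""
    messages = []
    for name, value in items.items():
        definition = dictionary.get(name)
        if definition is None:
            messages.append(f"{name}: data name is not defined in the dictionary")
            continue
        if value in ("?", "."):
            continue  # placeholders for unknown / inapplicable values
        if definition["type"] == "numb":
            if not NUMBER.match(value):
                messages.append(f"{name}: value '{value}' is not numeric")
                continue
            # Drop the standard uncertainty before converting to a float.
            number = float(re.sub(r"\(\d+\)$", "", value))
            if "min" in definition and number < definition["min"]:
                messages.append(f"{name}: value {value} is below {definition['min']}")
            if "max" in definition and number > definition["max"]:
                messages.append(f"{name}: value {value} is above {definition['max']}")
        if "enum" in definition and value not in definition["enum"]:
            messages.append(f"{name}: value '{value}' is not an allowed enumeration value")
    return messages


example = """\
data_example
_cell_length_a 5.431(2)
_cell_angle_alpha 190
_symmetry_cell_setting qubic
_cell_lenght_b 5.431
"""

messages = validate(parse_simple_items(example), TOY_DICTIONARY)
for message in messages:
    print(message)
```

In this toy input, the out-of-range angle, the misspelled enumeration value and the misspelled data name are each reported, while the cell length with a standard uncertainty passes. A real validator must additionally handle loops, multi-line values, save frames and the full set of dictionary attributes.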

Keywords: CIF dictionary; CIF validation; Crystallographic Information Framework; Crystallography Open Database; DDLm.

Grants and funding

This work was funded by the Research Council of Lithuania (grant MIP-20-21).