Self-reporting data assets and their representation in the pharmaceutical industry

Drug Discov Today. 2022 Jan;27(1):207-214. doi: 10.1016/j.drudis.2021.07.019. Epub 2021 Jul 28.

Abstract

Standardizing data is crucial for preserving and exchanging scientific information. In particular, recording the context in which data were created ensures that information remains findable, accessible, interoperable, and reusable (FAIR). Here, we introduce the concept of self-reporting data assets (SRDAs), which preserve data together with their contextual information. Because SRDAs are an abstract concept, implementing them requires a suitable data format. Four promising data formats or languages are commonly used to represent data in the pharmaceutical industry: JCAMP-DX, JSON, AnIML, and, more recently, the Allotrope Data Format (ADF). We evaluate these four options against multiple criteria in common pharmaceutical use cases. The evaluation shows that ADF is the most suitable format for implementing SRDAs.
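As a rough illustration of the idea that data and context travel together, the following minimal sketch bundles measured values with their context in JSON, one of the four formats named above. It is not the authors' SRDA definition or any standard schema: the function name make_asset, every field name, and all values are hypothetical placeholders chosen for illustration only.

```python
import json


def make_asset(values, unit, context):
    """Bundle measured values with the context in which they were created.

    Hypothetical structure for illustration; not an ADF, AnIML, or
    JCAMP-DX layout and not a schema taken from the article.
    """
    return {
        "data": {"values": list(values), "unit": unit},
        "context": dict(context),
    }


# Placeholder values: none of these come from the article.
asset = make_asset(
    values=[0.12, 0.48, 0.91],
    unit="AU",
    context={
        "technique": "UV-Vis spectroscopy",
        "instrument": "example-spectrometer-01",
        "sample_id": "example-sample-A",
        "recorded_at": "2021-07-28T00:00:00Z",
    },
)

# Serialize to JSON text and read it back: the context is stored in the
# same document as the data, so it survives the exchange.
serialized = json.dumps(asset, indent=2, sort_keys=True)
restored = json.loads(serialized)
```

Plain JSON of this kind keeps the context next to the data but attaches no agreed meaning to field names such as "technique"; the keywords of the article (data model, ontology) point to that gap as one of the things a format comparison has to weigh.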

Keywords: ADF; AnIML; Data format; Data model; FAIR data; JCAMP-DX; JSON; Ontology; Pharmaceutical data; Scientific data.

Publication types

  • Review

MeSH terms

  • Data Accuracy*
  • Data Curation* / methods
  • Data Curation* / standards
  • Diffusion of Innovation
  • Drug Industry* / methods
  • Drug Industry* / organization & administration
  • Humans
  • Information Dissemination / methods*
  • Proof of Concept Study
  • Reference Standards
  • Research Design / standards*
  • Technology, Pharmaceutical / methods