A novel tool for standardizing clinical data in a semantically rich model

J Biomed Inform. 2020:112S:100086. doi: 10.1016/j.yjbinx.2020.100086. Epub 2020 Sep 19.

Abstract

Standardizing clinical information in a semantically rich data model is useful for promoting interoperability and facilitating high-quality research. Semantic Web technologies such as the Resource Description Framework (RDF) can be utilized to their full potential when a model accurately reflects the semantics of the clinical situation it describes. To this end, ontologies that abide by sound organizational principles can be used as the building blocks of a semantically rich model for the storage of clinical data. However, it is a challenge to programmatically define such a model and load data from disparate sources. The PennTURBO Semantic Engine is a tool developed at the University of Pennsylvania that transforms concise RDF data into a source-independent, semantically rich model. This system sources classes from an application ontology and specifically defines how instances of those classes may relate to each other. Additionally, the system defines and executes RDF data transformations by launching dynamically generated SPARQL update statements. The Semantic Engine was designed as a generalizable data standardization tool and can work with various data models and incoming data sources. Its human-readable configuration files can easily be shared between institutions, providing the basis for collaboration on a standard data model.

Keywords: Biomedical ontologies; Clinical data; Common data model; Data interoperability; Semantic Web technologies.
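
The abstract's mention of dynamically generated SPARQL update statements can be illustrated with a minimal sketch. It assumes a hypothetical rule format (the keys "prefixes", "input", "bind", and "output"), placeholder example.org IRIs, and a hypothetical helper named build_sparql_update; none of these come from the PennTURBO Semantic Engine, whose actual configuration format and generation logic are not described here. The sketch only shows the general idea of turning a human-readable rule into a SPARQL 1.1 INSERT ... WHERE string.

```python
from textwrap import indent

# Hypothetical, human-readable transformation rule. The keys, prefixes,
# and IRIs are illustrative assumptions, not PennTURBO's actual format.
RULE = {
    "prefixes": {
        "ex": "http://example.org/model/",
        "src": "http://example.org/source/",
    },
    # Pattern matched against the concise incoming RDF.
    "input": ["?row src:patientId ?id ."],
    # Mint an IRI for the new instance from the source identifier.
    "bind": [
        'BIND(IRI(CONCAT("http://example.org/patient/", STR(?id))) AS ?patient)'
    ],
    # Triples written into the target model.
    "output": [
        "?patient a ex:Patient .",
        "?patient ex:hasIdentifier ?id .",
    ],
}


def build_sparql_update(rule):
    """Assemble a SPARQL 1.1 INSERT ... WHERE update string from a rule dict."""
    prefixes = "\n".join(
        f"PREFIX {name}: <{iri}>"
        for name, iri in sorted(rule["prefixes"].items())
    )
    insert_body = indent("\n".join(rule["output"]), "  ")
    where_body = indent("\n".join(rule["input"] + rule.get("bind", [])), "  ")
    return (
        f"{prefixes}\n"
        f"INSERT {{\n{insert_body}\n}}\n"
        f"WHERE {{\n{where_body}\n}}"
    )


UPDATE = build_sparql_update(RULE)
print(UPDATE)
```

The resulting string could be sent to any triple store that accepts SPARQL 1.1 Update requests. Keeping the rule as plain data rather than hand-written queries is what would make such a configuration easy to share and review between institutions.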