Automatic knowledge graph population with model-complete text comprehension for pre-clinical outcomes in the field of spinal cord injury

Hendrik Ter Horst; Nicole Brazda; Jessica Schira-Heinen; Julia Krebbers; Hans-Werner Müller; Philipp Cimiano

doi:10.1016/j.artmed.2023.102491

Artif Intell Med. 2023 Mar:137:102491. doi: 10.1016/j.artmed.2023.102491. Epub 2023 Jan 17.

Authors

Hendrik Ter Horst¹, Nicole Brazda², Jessica Schira-Heinen², Julia Krebbers², Hans-Werner Müller², Philipp Cimiano³

Affiliations

¹ CITEC, Bielefeld University, Inspiration 1, 33619 Bielefeld, Germany. Electronic address: hterhors@techfak.uni-bielefeld.de.
² Neurologische Klinik, Universitätsklinikum der Heinrich-Heine-Universität Düsseldorf, Moorenstr. 5 and Center for Neuronal Regeneration, Life Science Center Düsseldorf, Merowingerplatz 1a, 40225 Düsseldorf, Germany.
³ CITEC, Bielefeld University, Inspiration 1, 33619 Bielefeld, Germany.

PMID: 36868686

Abstract

The paradigm of evidence-based medicine requires that medical decisions be made on the basis of the best available knowledge published in the literature. Existing evidence is often summarized in systematic reviews and/or meta-reviews and is rarely available in a structured form. Manual compilation and aggregation are costly, and conducting a systematic review requires considerable effort. The need to aggregate evidence arises not only for clinical trials but also for pre-clinical animal studies, where evidence extraction supports the translation of the most promising pre-clinical therapies into clinical trials or the optimization of clinical trial design. To facilitate the aggregation of evidence published in pre-clinical studies, this paper presents a new system that automatically extracts structured knowledge from such publications and stores it in a so-called domain knowledge graph. The approach follows the paradigm of model-complete text comprehension: guided by a domain ontology, it creates a deep relational data structure that reflects the main concepts, protocol, and key findings of a study. In the domain of spinal cord injury, a single outcome of a pre-clinical study is described by up to 103 outcome parameters. Since extracting all these variables jointly is intractable, we propose a hierarchical architecture that incrementally predicts semantic sub-structures according to a given data model in a bottom-up fashion. At the heart of our approach is a statistical inference method that relies on conditional random fields to infer the most likely instance of the domain model given the text of a scientific publication as input. This approach allows dependencies between the different variables describing a study to be modeled in a semi-joint fashion. We present a comprehensive evaluation to understand the extent to which the system can capture a study in the depth required to enable the generation of new knowledge. We conclude with a brief description of some applications of the populated knowledge graph and show the potential implications of our work for supporting evidence-based medicine.
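
The bottom-up population of a nested data model described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: plain keyword matching stands in for the conditional random field inference, and every class name, slot name, and keyword list below is a hypothetical simplification of a domain model that, per the abstract, has up to 103 parameters per outcome.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical keyword lists standing in for a trained extractor.
SPECIES = ("rat", "mouse")
INJURIES = ("contusion", "transection")
COMPOUNDS = ("chondroitinase", "methylprednisolone")
TRENDS = ("improved", "worsened", "unchanged")


def first_match(text: str, candidates: Sequence[str]) -> Optional[str]:
    """Return the candidate that occurs earliest in the text, or None."""
    lowered = text.lower()
    hits = [(lowered.find(c), c) for c in candidates if c in lowered]
    return min(hits)[1] if hits else None


# Illustrative slices of a nested data model (names are assumptions).
@dataclass
class AnimalModel:
    species: Optional[str]


@dataclass
class Injury:
    injury_type: Optional[str]


@dataclass
class Treatment:
    compound: Optional[str]


@dataclass
class ExperimentalGroup:
    animal_model: AnimalModel
    injury: Injury
    treatment: Treatment


@dataclass
class Outcome:
    group: ExperimentalGroup
    trend: Optional[str]


def populate(text: str) -> Outcome:
    """Fill the nested template level by level, leaves first."""
    # Level 1: fill the leaf sub-structures independently of each other.
    animal = AnimalModel(first_match(text, SPECIES))
    injury = Injury(first_match(text, INJURIES))
    treatment = Treatment(first_match(text, COMPOUNDS))
    # Level 2: compose the predicted leaves into an experimental group.
    group = ExperimentalGroup(animal, injury, treatment)
    # Level 3: attach the group to the outcome along with the outcome's own slot.
    return Outcome(group, first_match(text, TRENDS))


example = populate(
    "Adult rats received a thoracic contusion and chondroitinase "
    "treatment; locomotor scores improved."
)
```

Each level only has to choose among a small set of slots while reusing the sub-structures predicted below it, which is the intuition behind splitting an intractable joint prediction into incremental steps. A learned model would replace `first_match` and could score slot combinations within a level jointly.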

Keywords: Deep knowledge graph population; Information extraction; Pre-clinical outcomes; Spinal cord injury; Structured prediction.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Animals
Comprehension*
Evidence-Based Medicine
Pattern Recognition, Automated
Spinal Cord Injuries*
Systematic Reviews as Topic