Exploring Opportunities for Clinical Data Warehouse Enhancement Through Data Catalog Integration

Stud Health Technol Inform. 2024 Apr 26;313:198-202. doi: 10.3233/SHTI240037.

Abstract

Secondary use of clinical health data requires prior integration of mostly heterogeneous and multidimensional data sets. A clinical data warehouse provides the technological and organizational framework required for this by making any data available for analysis. However, users of a data warehouse often lack a comprehensive overview of all available data and know only about their own data in their own systems, a situation also referred to as a 'data siloed state'. This problem can be addressed and ultimately solved by implementing a data catalog. Its core function is a search engine that searches the metadata collected from different data sources and thereby gives access to all available data. With this in mind, we conducted an explorative online market survey followed by a vendor comparison as a prerequisite for selecting a data catalog system. Vendor performance was assessed against seven predetermined, weighted selection criteria. Although three vendors achieved the highest score, the results were close together. Detailed investigations and test installations are needed to narrow down the selection further.
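
The weighted assessment described above can be illustrated with a minimal sketch. The abstract states only that seven predetermined, weighted criteria were used; the criterion names, the weights, the rating scale, and the function names below are hypothetical placeholders and are not taken from the study.

```python
from typing import Dict, List

# Hypothetical weights for seven placeholder criteria (assumed values,
# not the criteria or weights used in the study).
WEIGHTS: Dict[str, float] = {
    "criterion_1": 3, "criterion_2": 3, "criterion_3": 2, "criterion_4": 2,
    "criterion_5": 1, "criterion_6": 1, "criterion_7": 1,
}


def weighted_score(ratings: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weight-normalized score of one vendor.

    `ratings` maps each criterion to a rating on an assumed numeric scale.
    """
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def top_vendors(all_ratings: Dict[str, Dict[str, float]],
                weights: Dict[str, float] = WEIGHTS,
                tolerance: float = 1e-9) -> List[str]:
    """Return every vendor whose score ties with the highest score."""
    scores = {v: weighted_score(r, weights) for v, r in all_ratings.items()}
    best = max(scores.values())
    return sorted(v for v, s in scores.items() if best - s <= tolerance)
```

Returning all tied vendors rather than a single winner matches the situation reported here, in which several vendors share the highest score and further investigation is needed to separate them.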

Keywords: Clinical Data Warehouse; Data Catalog; Health data; Metadata management; Secondary use.

MeSH terms

  • Data Warehousing*
  • Electronic Health Records
  • Humans
  • Information Storage and Retrieval / methods
  • Metadata
  • Search Engine