Employing complex polyhierarchical ontologies and promoting interoperability of i2b2 data systems

AMIA Annu Symp Proc. 2015 Nov 5;2015:359-65. eCollection 2015.

Abstract

i2b2 is in widespread use for managing research data warehouses. It employs reference ontologies as a record index and supports searching for aggregate cases using a pattern-match operator on ASCII strings that represent the node traversal from root to concept (PATHs). This creates complexities in dissemination and deployment for large polyhierarchical ontologies such as SNOMED CT. We hypothesized that an alternative approach employing transitive closure (TC) tables could lead to more accurate, efficient, and interoperable search tools for i2b2. We evaluated the search speed, accuracy, and interoperability of queries employing each approach. We found that both TC-based and PATH-based queries produced accurate results. However, TC-based queries involving concepts included in large numbers of paths ran substantially faster than PATH-based queries for the same concepts. Oracle query plan resource estimates differed by one to three orders of magnitude for these queries. We conclude that simplifying the dissemination tools for SNOMED CT and revising the metadata build for i2b2 would allow i2b2 to employ SNOMED CT with increased efficiency and comparable accuracy. Use of transitive closure tables in metadata can promote network query interoperability.
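The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not the authors' implementation or the i2b2 schema: the toy hierarchy, the backslash-delimited path format, and all function names are assumptions made for illustration only. The sketch shows why a concept with several parents multiplies the number of path strings to match, while a transitive closure table answers the same question with one equality lookup on the ancestor.

```python
# Hypothetical toy polyhierarchy (child -> parents); not real SNOMED CT content.
PARENTS = {
    "disease": [],
    "infection": ["disease"],
    "lung_disease": ["disease"],
    "pneumonia": ["infection", "lung_disease"],  # two parents: polyhierarchy
    "viral_pneumonia": ["pneumonia"],
}


def build_paths(parents):
    """Enumerate every root-to-concept path as a delimited string (assumed format)."""
    cache = {}

    def paths_to(node):
        if node not in cache:
            ups = parents[node]
            if not ups:
                cache[node] = ["\\" + node + "\\"]
            else:
                cache[node] = [p + node + "\\" for up in ups for p in paths_to(up)]
        return cache[node]

    return [(p, node) for node in parents for p in paths_to(node)]


def build_closure(parents):
    """Return all (ancestor, descendant) pairs, including each concept with itself."""

    def ancestors(node):
        result = {node}
        for up in parents[node]:
            result |= ancestors(up)
        return result

    return {(anc, node) for node in parents for anc in ancestors(node)}


def descendants_by_path(path_rows, concept):
    """PATH-style search: prefix match against every path ending at the concept."""
    prefixes = [p for p, node in path_rows if node == concept]
    return {node for p, node in path_rows if any(p.startswith(pre) for pre in prefixes)}


def descendants_by_closure(closure, concept):
    """TC-style search: a single equality test on the ancestor column."""
    return {desc for anc, desc in closure if anc == concept}


PATH_ROWS = build_paths(PARENTS)
CLOSURE = build_closure(PARENTS)
```

In this toy example both functions return the same descendant set for `"pneumonia"`, but the PATH-style search must test two prefixes because the concept is reachable by two routes. The sketch says nothing about real database performance, which depends on indexing and the query planner.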

MeSH terms

  • Algorithms
  • Data Mining / methods*
  • Decision Making, Computer-Assisted*
  • Electronic Health Records / organization & administration*
  • Information Systems / organization & administration*
  • Medical Informatics
  • Metadata
  • Search Engine / methods*
  • Systematized Nomenclature of Medicine*