Inter-patient distance metrics using SNOMED CT defining relationships

J Biomed Inform. 2006 Dec;39(6):697-705. doi: 10.1016/j.jbi.2006.01.004. Epub 2006 Feb 24.

Abstract

Background: Patient-based similarity metrics are important case-based reasoning tools that may assist with research and patient care applications. Ontology and information content principles may be helpful for developing such metrics.

Methods: Patient cases from 1989 through 2003 from the Columbia University Medical Center data repository were converted to SNOMED CT concepts. Five metrics were implemented: (1) percent disagreement with data as an unstructured "bag of findings," (2) average links between concepts, (3) links weighted by information content with descendants, (4) links weighted by information content with term prevalence, and (5) path distance using descendants weighted by information content with descendants. Three physicians served as the gold standard for 30 cases.
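
The abstract names the five metrics but does not give their formulas. As an illustration only, the following Python sketch shows one plausible reading of three ingredients: percent disagreement over a bag of findings, the average number of links between concepts, and information content derived from descendant counts. The toy hierarchy, the function names, the symmetric nearest-concept averaging, and the formula IC(c) = -log((descendants(c) + 1) / N) are all assumptions made for this example. They are not taken from the paper and do not use real SNOMED CT content.

```python
import math
from collections import deque

# Hypothetical toy is-a hierarchy (child -> parents); NOT real SNOMED CT content.
PARENTS = {
    "finding": [],
    "cardiac finding": ["finding"],
    "respiratory finding": ["finding"],
    "chest pain": ["cardiac finding"],
    "palpitations": ["cardiac finding"],
    "cough": ["respiratory finding"],
}

def percent_disagreement(a, b):
    """Share of concepts found in exactly one of the two patients.
    Assumption: the denominator is the union of both bags of findings."""
    a, b = set(a), set(b)
    union = a | b
    return len(a ^ b) / len(union) if union else 0.0

def children_of(parents):
    """Invert the child -> parents map into parent -> children."""
    children = {c: set() for c in parents}
    for child, ps in parents.items():
        for p in ps:
            children.setdefault(p, set()).add(child)
    return children

def link_distance(c1, c2, parents=PARENTS):
    """Fewest is-a links between two concepts, ignoring link direction (BFS)."""
    children = children_of(parents)
    queue, seen = deque([(c1, 0)]), {c1}
    while queue:
        node, dist = queue.popleft()
        if node == c2:
            return dist
        for nxt in set(parents.get(node, ())) | children.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("concepts are not connected")

def average_link_distance(a, b, parents=PARENTS):
    """Assumption: match each concept to its nearest concept in the other
    patient, in both directions, and average the link counts."""
    a, b = set(a), set(b)
    dists = [min(link_distance(x, y, parents) for y in b) for x in a]
    dists += [min(link_distance(y, x, parents) for x in a) for y in b]
    return sum(dists) / len(dists)

def descendants(concept, parents=PARENTS):
    """All concepts below `concept` in the hierarchy."""
    children = children_of(parents)
    seen, stack = set(), [concept]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def information_content(concept, parents=PARENTS):
    """Assumed form: IC(c) = -log((descendants(c) + 1) / N).
    Concepts with few descendants are more specific and score higher."""
    return -math.log((len(descendants(concept, parents)) + 1) / len(parents))

patient_a = {"chest pain", "cough"}
patient_b = {"chest pain", "palpitations"}
print(percent_disagreement(patient_a, patient_b))
print(average_link_distance(patient_a, patient_b))
print(information_content("chest pain"), information_content("finding"))
```

In this sketch the root concept carries zero information content and leaf concepts carry the most, so an information-content weight would make links near the leaves count for more than links near the root. How the paper actually combines such weights with link counts is not stated in the abstract.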

Results: Expert inter-rater reliability was 0.91, with rank correlations between 0.61 and 0.81, representing upper-bound performance. Correlations between the expert ratings and the five metrics were 0.27, 0.29, 0.30, 0.30, and 0.30, respectively. Using the SNOMED CT Clinical Findings axis alone increased the correlation to 0.37.
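
The results are reported as correlations between expert judgments and metric output. The abstract does not say which coefficient was used for every comparison, so the following sketch assumes Spearman's rank correlation, computed as the Pearson correlation of average ranks. The function names and the sample numbers are hypothetical and serve only to show the calculation; they are not data from the study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical example: an expert's similarity scores for five case pairs
# versus a metric's distances for the same pairs (made-up numbers).
expert_similarity = [5, 4, 3, 2, 1]
metric_distance = [0.1, 0.3, 0.2, 0.8, 0.9]
print(spearman(expert_similarity, metric_distance))
```

Because the metrics in this example are distances and the expert scores are similarities, a good metric yields a strongly negative coefficient; reporting it as a positive agreement figure would require negating one of the two inputs first.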

Conclusion: Ontology principles and information content provide useful information for similarity metrics but currently fall short of expert performance.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Clinical Medicine / classification*
  • Humans
  • Information Storage and Retrieval
  • Medical Records Systems, Computerized / classification*
  • Models, Statistical
  • Models, Theoretical
  • Natural Language Processing
  • Systematized Nomenclature of Medicine*
  • Systems Integration
  • Terminology as Topic
  • Unified Medical Language System
  • Vocabulary, Controlled