Non-redundant association rules between diseases and medications: an automated method for knowledge base construction

BMC Med Inform Decis Mak. 2015 Apr 15:15:29. doi: 10.1186/s12911-015-0151-9.

Abstract

Background: The widespread use of electronic health records (EHRs) has generated massive stores of clinical data. Association rule mining is a feasible technique for converting this large amount of data into usable knowledge for clinical decision making, research or billing. We present a data-driven method for creating a knowledge base that links medications to pathological conditions through their therapeutic indications, using elements within the EHRs.

Methods: Association rules were created from the data of patients hospitalised between May 2012 and May 2013 in the Department of Cardiology at the University Hospital of Strasbourg. Medications were extracted from the medication list, and pathological conditions were extracted from the discharge summaries using a natural language processing tool. Association rules were generated along with several interestingness measures: chi-square, lift, conviction, dependency, novelty and satisfaction. All medication-disease pairs were compared to the Summary of Product Characteristics, which served as the gold standard. A score based on the other interestingness measures was created to filter the best rules, and performance indices were calculated for each interestingness measure.
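
As an illustration of the interestingness measures named above, the following minimal Python sketch computes them for a single medication-to-disease rule from co-occurrence counts. The formulas are commonly cited textbook definitions and are an assumption here; the abstract does not state the exact formulas used, so they may differ in detail. The class name `RuleCounts`, the function `interestingness` and the example counts are hypothetical.

```python
from dataclasses import dataclass
import math


@dataclass
class RuleCounts:
    """Co-occurrence counts for a rule medication -> disease (hypothetical layout)."""
    n: int      # total number of hospital stays
    n_a: int    # stays where the medication (antecedent) appears
    n_b: int    # stays where the disease (consequent) appears
    n_ab: int   # stays where both appear


def interestingness(c: RuleCounts) -> dict:
    """Compute common interestingness measures (assumed textbook definitions)."""
    p_a, p_b, p_ab = c.n_a / c.n, c.n_b / c.n, c.n_ab / c.n
    confidence = p_ab / p_a                      # P(B | A)
    p_a_not_b = p_a - p_ab                       # P(A and not B)

    # 2x2 contingency table cells for the chi-square statistic
    a = c.n_ab                                   # A and B
    b = c.n_a - c.n_ab                           # A and not B
    c_ = c.n_b - c.n_ab                          # not A and B
    d = c.n - c.n_a - c.n_b + c.n_ab             # neither
    chi2 = c.n * (a * d - b * c_) ** 2 / ((a + b) * (c_ + d) * (a + c_) * (b + d))

    return {
        "support": p_ab,
        "confidence": confidence,
        "lift": p_ab / (p_a * p_b),
        # Conviction is infinite when the rule is never violated
        "conviction": math.inf if p_a_not_b == 0 else p_a * (1 - p_b) / p_a_not_b,
        "dependency": abs(confidence - p_b),
        "novelty": p_ab - p_a * p_b,
        "satisfaction": ((1 - p_b) - (1 - confidence)) / (1 - p_b),
        "chi_square": chi2,
    }


# Made-up counts: 100 stays, medication in 20, disease in 25, both in 15
example = interestingness(RuleCounts(n=100, n_a=20, n_b=25, n_ab=15))
```

With these made-up counts the medication and the disease co-occur three times more often than expected under independence (lift of 3), which is the kind of signal a filtering score could build on.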

Results: Evaluation against the gold standard showed that a list of accurate association rules was successfully retrieved. Dependency achieved the best recall (0.76). Our score exhibited higher exactness (0.84) and precision (0.27) than all of the other interestingness measures. The noise produced by this method must be reduced further to improve classification precision.
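
The evaluation indices reported above can be sketched as follows, assuming that rules are treated as medication-disease pairs, that a filter selects a subset of the candidate pairs, and that "exactness" corresponds to classification accuracy. The function `evaluate` and the pair names are hypothetical placeholders, not data from the study.

```python
def evaluate(candidates: set, selected: set, gold: set) -> dict:
    """Score a rule filter against a gold standard of valid pairs.

    candidates: all medication-disease pairs produced by rule mining
    selected:   pairs kept by the filter (a subset of candidates)
    gold:       pairs confirmed by the gold standard
    """
    rejected = candidates - selected
    tp = len(selected & gold)        # kept and valid
    fp = len(selected - gold)        # kept but not valid
    fn = len(rejected & gold)        # dropped although valid
    tn = len(rejected - gold)        # dropped and not valid
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(candidates) if candidates else 0.0,
    }


# Placeholder pairs for illustration only
candidates = {("drug_a", "disease_x"), ("drug_a", "disease_y"),
              ("drug_b", "disease_x"), ("drug_b", "disease_z"),
              ("drug_c", "disease_y"), ("drug_c", "disease_z")}
gold = {("drug_a", "disease_x"), ("drug_b", "disease_z"), ("drug_c", "disease_y")}
selected = {("drug_a", "disease_x"), ("drug_a", "disease_y")}

scores = evaluate(candidates, selected, gold)
```

A high-recall measure keeps most valid pairs at the cost of many false positives, whereas a stricter filter trades recall for precision; this is the trade-off visible in the figures reported above.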

Conclusions: Association rule mining using the unstructured elements of the EHR is a feasible technique for identifying clinically accurate associations between medications and pathological conditions.

MeSH terms

  • Cardiovascular Diseases / drug therapy*
  • Data Mining / methods*
  • Electronic Health Records*
  • Humans
  • Knowledge Bases*
  • Natural Language Processing*