Automated Mapping of Real-world Oncology Laboratory Data to LOINC

AMIA Annu Symp Proc. 2022 Feb 21;2021:611-620. eCollection 2021.

Abstract

In this study, we seek to determine the efficacy of automated mapping methods in reducing the manual burden of mapping laboratory data to LOINC® on a nationwide, electronic health record-derived, oncology-specific dataset. We developed novel encoding methodologies to vectorize free-text lab data and evaluated logistic regression, random forest, and k-nearest neighbors (KNN) machine learning classifiers. All machine learning models performed significantly better than deterministic baseline algorithms. The best classifiers were random forests, which predicted the correct LOINC code 94.5% of the time. Ensemble classifiers further increased accuracy, with the best ensemble classifier predicting the same code 80.5% of the time with an accuracy of 99%. We conclude that by using an automated laboratory mapping model we can both reduce manual mapping time and increase the quality of mappings, suggesting that automated mapping is a viable tool in a real-world oncology dataset.
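To make the general idea concrete, the following is a minimal sketch of vectorizing free-text lab names and classifying them with a nearest-neighbor vote. It is not the study's method: the abstract does not describe its encodings, so character trigram counts with cosine similarity are swapped in as an assumption. The function names (`char_ngrams`, `cosine`, `predict_knn`) and the tiny `TRAINING` list are hypothetical and purely illustrative; they are not drawn from the study's dataset.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Encode a free-text lab name as a bag of character n-grams.

    Assumption: trigram counts stand in for the unspecified encoding.
    """
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_knn(query, training, k=3):
    """Return the majority LOINC code among the k most similar training names."""
    q = char_ngrams(query)
    ranked = sorted(
        training,
        key=lambda pair: cosine(q, char_ngrams(pair[0])),
        reverse=True,
    )
    votes = Counter(code for _, code in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy examples: (local lab name, LOINC code).
TRAINING = [
    ("hemoglobin", "718-7"),
    ("hgb blood", "718-7"),
    ("hemoglobin g/dl", "718-7"),
    ("glucose serum", "2345-7"),
    ("glucose, plasma", "2345-7"),
    ("gluc ser/plas", "2345-7"),
    ("platelet count", "777-3"),
    ("platelets auto", "777-3"),
    ("plt count automated", "777-3"),
]

if __name__ == "__main__":
    print(predict_knn("Hemoglobin (blood)", TRAINING))
```

In practice, a classifier of this kind would be trained on many manually mapped records and compared against a deterministic baseline such as exact string matching, as the abstract describes.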

MeSH terms

  • Algorithms
  • Electronic Health Records
  • Humans
  • Laboratories
  • Logical Observation Identifiers Names and Codes*
  • Machine Learning*