Algorithm to Identify Systemic Cancer Therapy Treatment Using Structured Electronic Data

JCO Clin Cancer Inform. 2017 Nov:1:1-9. doi: 10.1200/CCI.17.00002.

Abstract

Purpose: With the shift in the majority of oncology clinical care in the United States from paper records to electronic health records, researchers need efficient and validated processes to obtain accurate data about the entire treatment history of patients diagnosed with cancer. The objective of this study was to develop and validate an algorithm that is agnostic to the source of data but that can identify specific regimens in the entire course of systemic therapy treatment for patients diagnosed with breast, colorectal, or lung cancer.

Methods: A cohort of patients with incident breast, colorectal, or lung cancer was randomly distributed into six groups. The algorithm was iteratively modified and its performance assessed in the first three groups until no additional modifications could be identified. Its performance was then confirmed in the remaining three groups.
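
The development and validation split described in the Methods can be illustrated with a minimal sketch. The abstract does not say how the random assignment was done, so the round-robin dealing, the fixed seed, the function name `split_into_groups`, and the patient identifiers below are all assumptions made for illustration only.

```python
import random


def split_into_groups(patient_ids, n_groups=6, seed=0):
    """Shuffle patient IDs and deal them round-robin into n_groups groups.

    Hypothetical helper: the source does not describe its randomization
    procedure, only that six groups were formed at random.
    """
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    # Slice with a stride so group sizes differ by at most one patient.
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Made-up identifiers, not study data.
patient_ids = [f"P{i:03d}" for i in range(1, 61)]
groups = split_into_groups(patient_ids)

# First three groups: iterative algorithm development.
# Remaining three groups: confirmation of performance.
development, validation = groups[:3], groups[3:]
```

Keeping the confirmation groups untouched until the algorithm is frozen is what lets the second set of results serve as an independent check.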

Results: The final model achieved sensitivity of 97.2% to 100% for first-course systemic therapy across all cancers, with a false-positive rate of 0%. For all courses of therapy, the algorithm matched the exact number of courses and the exact regimens of systemic therapy agents, as captured by infusion, pharmacy, and procedure electronic medical record data, 88% to 100% of the time.
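
The metrics reported in the Results can be computed along the following lines. This is a sketch under stated assumptions, not the authors' code: the data layout (one dictionary per source keyed by patient ID), the function names, and the toy values are hypothetical, and the numbers are invented purely to exercise the functions.

```python
def sensitivity_and_fpr(reference, predicted):
    """Return (sensitivity, false-positive rate) for a binary flag per patient.

    reference: dict of patient ID -> True if the reference data show
               first-course systemic therapy (assumed layout).
    predicted: dict of patient ID -> True if the algorithm flags it.
    """
    tp = fn = fp = tn = 0
    for patient_id, truth in reference.items():
        flagged = predicted.get(patient_id, False)
        if truth and flagged:
            tp += 1
        elif truth:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return sensitivity, fpr


def exact_match_rate(reference_courses, algorithm_courses):
    """Share of patients whose ordered list of courses matches exactly.

    Each course is a frozenset of agent names, so a match requires both the
    same number of courses and the same regimen in every course.
    """
    matches = sum(
        1
        for patient_id, courses in reference_courses.items()
        if algorithm_courses.get(patient_id, []) == courses
    )
    return matches / len(reference_courses)


# Toy data only (not results from the study).
reference_treated = {"P1": True, "P2": True, "P3": False, "P4": False}
algorithm_treated = {"P1": True, "P2": False, "P3": False, "P4": False}
sens, fpr = sensitivity_and_fpr(reference_treated, algorithm_treated)

reference_courses = {
    "P1": [frozenset({"drug_a", "drug_b"}), frozenset({"drug_c"})],
    "P2": [frozenset({"drug_a"})],
}
algorithm_courses = {
    "P1": [frozenset({"drug_a", "drug_b"}), frozenset({"drug_c"})],
    "P2": [frozenset({"drug_a"}), frozenset({"drug_d"})],  # one extra course
}
match_rate = exact_match_rate(reference_courses, algorithm_courses)
```

Representing each course as a set of agents makes the comparison insensitive to the order in which agents are listed within a regimen while still requiring the courses themselves to appear in the same sequence.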

Conclusion: Our validated algorithm characterizes entire courses of systemic therapy treatment in patients diagnosed with breast, colorectal, and lung cancer. Its use will allow researchers in a variety of settings to conduct comparative effectiveness studies related to the uptake, safety, outcomes, and costs associated with both novel and standard regimens.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Combined Modality Therapy
  • Data Warehousing
  • Disease Management
  • Electronic Health Records / statistics & numerical data*
  • Female
  • Humans
  • Male
  • Neoplasms / diagnosis
  • Neoplasms / epidemiology*
  • Neoplasms / therapy
  • Registries
  • Reproducibility of Results
  • United States / epidemiology