A methodology to extract outcomes from routine healthcare data for patients with locally advanced non-small cell lung cancer

BMC Health Serv Res. 2018 Apr 11;18(1):278. doi: 10.1186/s12913-018-3029-6.

Abstract

Background: Outcomes for patients in the UK with locally advanced non-small cell lung cancer (LA NSCLC) are amongst the worst in Europe. Assessing outcomes is important for analysing the effectiveness of current practice. However, data quality is inconsistent and regular large-scale analysis is challenging. This project investigates the use of routine healthcare datasets to determine progression-free survival (PFS) and overall survival (OS) of patients treated with primary radical radiotherapy for LA NSCLC.

Methods: All LA NSCLC patients treated with primary radical radiotherapy in a 2-year period were identified, and paired manual and routine data were generated for an initial pilot study. Manual data were extracted from hospital records and considered the gold standard. Key time points were date of diagnosis, recurrence, death or last clinical encounter. Routine data were collected from various sources, including Hospital Episode Statistics, Personal Demographic Service, chemotherapy data, and radiotherapy datasets. Relevant event dates were defined by proxy time points and refined using backdating and time interval optimization. Dataset correlations were then tested on key clinical outcome indicators to establish whether routine data could be used as a reliable proxy measure for manual data.
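As a minimal sketch of how PFS and OS intervals could be derived from the key time points listed above, the following Python assumes that PFS ends at the earlier of recurrence or death, that patients with no event are censored at the last clinical encounter, and that a month is 30.44 days. The function names, the month conversion, and the example dates are hypothetical and are not taken from the study; the proxy rules, backdating, and interval optimization used by the authors are not specified in this abstract and are not reproduced here.

```python
from datetime import date
from typing import Optional, Tuple

# Assumption: average month length in days; the abstract does not state the conversion used.
DAYS_PER_MONTH = 30.44


def months_between(start: date, end: date) -> float:
    """Elapsed time from start to end, expressed in months."""
    return (end - start).days / DAYS_PER_MONTH


def survival_intervals(
    diagnosis: date,
    recurrence: Optional[date],
    death: Optional[date],
    last_encounter: date,
) -> Tuple[float, float]:
    """Return (pfs_months, os_months) for one patient.

    OS runs from diagnosis to death, or to the last encounter if no death is recorded.
    PFS runs from diagnosis to the earlier of recurrence or death,
    or to the last encounter if neither is recorded.
    """
    os_end = death if death is not None else last_encounter
    events = [d for d in (recurrence, death) if d is not None]
    pfs_end = min(events) if events else last_encounter
    return months_between(diagnosis, pfs_end), months_between(diagnosis, os_end)


# Hypothetical patient, for illustration only (not study data).
pfs, os_months = survival_intervals(
    diagnosis=date(2012, 1, 1),
    recurrence=date(2012, 11, 1),
    death=date(2013, 5, 1),
    last_encounter=date(2013, 4, 20),
)
print(f"PFS {pfs:.2f} months, OS {os_months:.2f} months")
```

The same function could be applied to both the manual dates and the routine-data proxy dates, giving paired PFS and OS values per patient for comparison.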

Results: Forty-three patients were identified for the pilot study. The manual data showed a median age of 67 years (range 46-89 years) and all patients had stage IIIA/B disease. Using the manual data, the median PFS was 10.78 months (range 1.58-37.49 months) and median OS was 16.36 months (range 2.69-37.49 months). Based on routine data, using proxy measures, the estimated median PFS was 10.68 months (range 1.61-31.93 months) and estimated median OS was 15.38 months (range 2.14-33.71 months). Overall, the routine data underestimated PFS and OS relative to the manual data, but the two correlated well, with Pearson correlation coefficients of 0.94 for PFS and 0.97 for OS.
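The comparison reported above rests on two summary statistics: the median of each series and the Pearson correlation coefficient between paired manual and routine values. A minimal sketch of both is given below, using only the Python standard library. The helper name `pearson_r` and the five paired values are made up for illustration and are not patient data from the study.

```python
from math import sqrt
from statistics import median
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Illustrative, made-up paired survival times in months (not study data).
manual_months = [4.0, 9.5, 12.0, 20.0, 30.0]
routine_months = [3.8, 9.0, 12.5, 18.5, 28.0]

print("median manual:", median(manual_months))
print("median routine:", median(routine_months))
print("Pearson r:", round(pearson_r(manual_months, routine_months), 3))
```

A high coefficient shows that the two series rise and fall together, but it does not by itself show agreement in absolute terms, which is why the medians are reported alongside it.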

Conclusions: This is a novel approach that uses routine datasets to determine outcome indicators in patients with LA NSCLC, as a surrogate for analysing manual data. Enabling efficient, large-scale analysis of current lung cancer strategies could have a substantial impact on the healthcare system.

Keywords: LA NSCLC; Outcomes; Routine datasets.

MeSH terms

  • Aged
  • Aged, 80 and over
  • Carcinoma, Non-Small-Cell Lung / mortality*
  • Carcinoma, Non-Small-Cell Lung / therapy
  • Disease-Free Survival
  • Europe / epidemiology
  • Female
  • Humans
  • Lung Neoplasms / mortality*
  • Lung Neoplasms / therapy
  • Male
  • Middle Aged
  • Neoplasm Recurrence, Local / mortality
  • Outcome Assessment, Health Care
  • Pilot Projects
  • Prognosis
  • Prospective Studies
  • United Kingdom / epidemiology