Big Data Readiness in Radiation Oncology: An Efficient Approach for Relabeling Radiation Therapy Structures With Their TG-263 Standard Name in Real-World Data Sets

Adv Radiat Oncol. 2018 Oct 12;4(1):191-200. doi: 10.1016/j.adro.2018.09.013. eCollection 2019 Jan-Mar.

Abstract

Purpose: To prepare for big data analyses on radiation therapy data, we developed Stature, a tool-supported approach for standardization of structure names in existing radiation therapy plans. We applied the widely endorsed nomenclature standard TG-263 as the mapping target and quantified the structure name inconsistency in 2 real-world data sets.

Methods and materials: The clinically relevant structures in the radiation therapy plans were identified by reference to randomized controlled trials. The Stature approach was used by clinicians to identify the synonyms for each relevant structure, which was then mapped to the corresponding TG-263 name. We applied Stature to standardize the structure names for 654 patients with prostate cancer (PCa) and 224 patients with head and neck squamous cell carcinoma (HNSCC) who received curative radiation therapy at our institution between 2007 and 2017. The accuracy of the Stature process was manually validated in a random sample from each cohort. For the HNSCC cohort we measured the resource requirements for Stature, and for the PCa cohort we demonstrated its impact on an example clinical analytics scenario.
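The three-tier matching described above (exact match on the local naming convention, match via a clinician-identified synonym, and fallback to manual matching) can be illustrated with a minimal Python sketch. This is not the Stature implementation: the dictionaries, the local names, the synonyms, and the `relabel` function are hypothetical, and the case-insensitive synonym lookup is an assumption made for the example.

```python
from collections import Counter

# Hypothetical local naming convention mapped to TG-263 names.
LOCAL_TO_TG263 = {
    "Bladder": "Bladder",
    "Rectum": "Rectum",
    "Lt Parotid": "Parotid_L",
}

# Hypothetical clinician-curated synonyms, keyed in lowercase,
# each pointing at a name from the local convention.
SYNONYMS = {
    "bladder_new": "Bladder",
    "rect": "Rectum",
    "l parotid": "Lt Parotid",
}


def relabel(name):
    """Return (tg263_name, match_type) for one structure name."""
    # Tier 1: the name already follows the local convention.
    if name in LOCAL_TO_TG263:
        return LOCAL_TO_TG263[name], "exact"
    # Tier 2: the name is a known synonym (case-insensitive here by assumption).
    key = name.strip().lower()
    if key in SYNONYMS:
        return LOCAL_TO_TG263[SYNONYMS[key]], "synonym"
    # Tier 3: no automatic match; leave for manual review.
    return None, "manual"


names = ["Bladder", "rect", "L Parotid", "Bladder_old?", "Rectum"]
results = [relabel(n) for n in names]
counts = Counter(match_type for _, match_type in results)
```

Counting the match types per cohort, as `counts` does here, yields the kind of exact, synonym, and manual tallies reported in the Results.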

Results: All but 1 synonym group ("Hydrogel") were mapped to the corresponding TG-263 name, resulting in a TG-263 relabel rate of 99% (8837 of 8925 structures). For the PCa cohort, Stature matched a total of 5969 structures. Of these, 5682 structures were exact matches (ie, following local naming convention), 284 were matched via a synonym, and 3 required manual matching. The original radiation therapy structure names therefore had a naming inconsistency rate of 4.81%. For the HNSCC cohort, Stature mapped a total of 2956 structures (2638 exact, 304 synonym, 14 manual; 10.76% inconsistency rate) and required 7.5 clinician hours. The clinician hours required were one-fifth of those that would be required for manual relabeling. The accuracy of Stature was 99.97% (PCa) and 99.61% (HNSCC).
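The reported rates can be reproduced from the counts given above. The sketch below assumes that the naming inconsistency rate is the share of structures that were not exact matches (synonym plus manual, divided by the total); the abstract does not state the formula, but this reading is consistent with the published percentages. The function name is hypothetical.

```python
def inconsistency_rate(exact, synonym, manual):
    """Percentage of structures that did not follow the local naming convention.

    Assumed definition: (synonym + manual) / (exact + synonym + manual).
    """
    total = exact + synonym + manual
    return 100.0 * (synonym + manual) / total


# Counts taken from the Results paragraph.
pca_rate = inconsistency_rate(exact=5682, synonym=284, manual=3)
hnscc_rate = inconsistency_rate(exact=2638, synonym=304, manual=14)

# The two cohort totals add up to the overall structure count.
total_structures = (5682 + 284 + 3) + (2638 + 304 + 14)

# Overall TG-263 relabel rate: relabeled structures over all structures.
relabel_rate = 100.0 * 8837 / total_structures

print(f"PCa inconsistency:   {pca_rate:.2f}%")
print(f"HNSCC inconsistency: {hnscc_rate:.2f}%")
print(f"TG-263 relabel rate: {relabel_rate:.0f}%")
```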

Conclusions: The Stature approach was highly accurate and had significant resource efficiencies compared with manual curation.