A pipeline for evaluation of machine learning/AI models to quantify PD-L1 immunohistochemistry

Lab Invest. 2024 Apr 25:102070. doi: 10.1016/j.labinv.2024.102070. Online ahead of print.

Abstract

Immunohistochemistry (IHC) is used to guide treatment decisions in multiple cancer types. For treatment with checkpoint inhibitors, PD-L1 IHC is used as a companion diagnostic. However, the scoring of PD-L1 is complicated by its expression in both cancer and immune cells. Separation of cancer and non-cancer regions is needed to calculate the tumor proportion score (TPS) of PD-L1, which is based on the percentage of PD-L1-positive cancer cells. Evaluation of PD-L1 expression requires highly experienced pathologists and is often challenging and time-consuming. Here we used a multi-institutional cohort of 77 lung cancer cases stained centrally with the PD-L1 22C3 clone. We developed a four-step pipeline for measuring TPS that includes the co-registration of H&E, PD-L1 and negative control (NC) digital slides for exclusion of necrosis, segmentation of cancer regions and quantification of PD-L1+ cells. As cancer segmentation is a challenging step for TPS generation, we trained DeepLab V3 in the Visiopharm software package to outline cancer regions in PD-L1 and NC images and evaluated model performance by mean intersection over union (mIoU) against manual outlines. Only 14 cases were required to achieve an mIoU of 0.82 for cancer segmentation in hematoxylin-stained NC cases. For PD-L1-stained slides, a model trained on PD-L1 tiles augmented by registered NC tiles achieved an mIoU of 0.79. In segmented cancer regions from whole slide images, the digital TPS achieved an accuracy of 75% against the manual TPS scores from the pathology report. Major reasons for algorithmic inaccuracies include the inclusion of immune cells in cancer outlines and poor nuclear segmentation of cancer cells. Our transparent and stepwise approach and performance metrics can be applied to any IHC assay to provide pathologists with important insights into when to apply and how to evaluate commercial automated IHC scoring systems.
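
The two quantities reported above, mIoU for segmentation quality and TPS as the percentage of PD-L1-positive cancer cells, can be illustrated with a minimal sketch. This is not the Visiopharm implementation: the function names, the list-of-lists mask format, and the choice to average IoU over the two classes (non-cancer = 0, cancer = 1) are assumptions made for illustration, and the study may compute mIoU differently (for example, per tile or per case).

```python
from typing import Optional, Sequence

# Hypothetical mask format: a 2D grid of class labels (0 = non-cancer, 1 = cancer).
Mask = Sequence[Sequence[int]]


def class_iou(pred: Mask, truth: Mask, label: int) -> Optional[float]:
    """Intersection over union for one class; None if the class is absent from both masks."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            in_pred = p == label
            in_truth = t == label
            intersection += in_pred and in_truth
            union += in_pred or in_truth
    return intersection / union if union else None


def mean_iou(pred: Mask, truth: Mask, labels: Sequence[int] = (0, 1)) -> float:
    """Mean of per-class IoU over the classes present (an assumed definition of mIoU)."""
    scores = [s for s in (class_iou(pred, truth, lab) for lab in labels) if s is not None]
    if not scores:
        raise ValueError("none of the requested labels occur in either mask")
    return sum(scores) / len(scores)


def tumor_proportion_score(positive_cancer_cells: int, total_cancer_cells: int) -> float:
    """TPS as the percentage of cancer cells counted as PD-L1 positive."""
    if total_cancer_cells <= 0:
        raise ValueError("total_cancer_cells must be positive")
    return 100.0 * positive_cancer_cells / total_cancer_cells


# Toy example: a model outline versus a manual outline on a 2 x 3 pixel tile.
model_mask = [[1, 1, 0],
              [0, 0, 0]]
manual_mask = [[1, 0, 0],
               [0, 0, 0]]
example_miou = mean_iou(model_mask, manual_mask)

# Toy example: 30 PD-L1-positive cells among 120 cancer cells in the segmented region.
example_tps = tumor_proportion_score(30, 120)
```

In this sketch, only cells inside the segmented cancer region contribute to the TPS denominator, which is why immune cells wrongly included in a cancer outline would shift the score.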

Keywords: Cancer segmentation; Digital Pathology; PD-L1; TPS.