DeepCG: A cell graph model for predicting prognosis in lung adenocarcinoma

Int J Cancer. 2024 Jun 15;154(12):2151-2161. doi: 10.1002/ijc.34901. Epub 2024 Mar 1.

Abstract

Lung cancer is the leading cause of cancer-related death in the United States, with lung adenocarcinoma as the major subtype, accounting for 40% of all cases. To improve patient survival, image-based prognostic models have been developed, taking advantage of the ready availability of pathological images at diagnosis. However, the application of these models is hampered by two main challenges: the lack of publicly available image datasets with high-quality survival information and the poor interpretability of conventional convolutional neural network models. Here, we integrated matched transcriptomic and H&E staining data from TCGA (The Cancer Genome Atlas) to develop an image-based prognostic model, termed the Deep-learning based Cell Graph (DeepCG) model. Instead of survival data, we used a gene signature to predict patient prognostic risks, which were then used as labels for training DeepCG. Importantly, by employing graph structures to capture cell patterns, DeepCG can provide cell-level interpretation, which is more biologically relevant than previous region-level insights. We validated the prognostic value of DeepCG in independent datasets and demonstrated its ability to identify prognostically informative cells in images.
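As a generic illustration of what a cell graph is, the following minimal sketch links each cell to its nearest neighbors based on nucleus centroid coordinates. It is not the DeepCG implementation: the abstract does not specify how the graph is constructed, so the k-nearest-neighbor rule, the function name `build_knn_cell_graph`, and the toy coordinates are all assumptions made for illustration only.

```python
import numpy as np


def build_knn_cell_graph(centroids, k=2):
    """Return sorted undirected edges (i, j), i < j, linking each cell
    to its k nearest neighbors (hypothetical construction rule)."""
    pts = np.asarray(centroids, dtype=float)
    n = len(pts)
    k = min(k, n - 1)  # a cell cannot have more than n - 1 neighbors

    # Pairwise Euclidean distances between all cell centroids.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude self-links

    edges = set()
    for i in range(n):
        for j in np.argsort(dist[i])[:k]:
            j = int(j)
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Toy centroids (made-up pixel coordinates, not data from the study).
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
edges = build_knn_cell_graph(centroids, k=1)
print(edges)
```

In a graph of this kind, each node is a single cell, which is why a graph-based model can attribute its prediction to individual cells rather than to image regions.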

Keywords: H&E; graph neural network; lung adenocarcinoma; prognosis.

MeSH terms

  • Adenocarcinoma of Lung* / pathology
  • Gene Expression Profiling
  • Humans
  • Lung Neoplasms* / pathology
  • Prognosis
  • Proportional Hazards Models