Background and aims: Ulcerative colitis [UC] is a complex heterogeneous disease. This study aims to reveal the underlying molecular features of UC using genome-scale transcriptomes of patients with UC, and to develop and validate a novel stratification scheme.
Methods: A normalised compendium was created using colon tissue samples (455 patients with UC and 147 healthy controls [HCs]), covering genes from 10 microarray datasets. Upregulated differentially expressed genes [DEGs] were subjected to functional network analysis, and samples were grouped using unsupervised clustering. The robustness of the subclustering was further assessed using two RNA sequencing datasets [100 patients with UC and 16 HCs]. Finally, an XGBoost classifier was applied to independent datasets to evaluate the efficacy of different biologics in patients with UC.
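As a purely illustrative sketch of the unsupervised clustering step, the following groups samples by two per-sample pathway scores. Everything here is an assumption rather than the study's method: the abstract does not name the clustering algorithm, so k-means with a deterministic farthest-point initialisation is swapped in, and the `epithelial`/`immune` scores are made-up toy values, not data from the compendium.

```python
from math import dist

# Toy (epithelial score, immune score) pairs per sample; values are invented
# for illustration only and do not come from the study.
points = [
    (2.0, 0.2), (1.8, 0.3),   # epithelial-dominant samples
    (1.0, 1.0), (1.1, 0.9),   # mixed samples
    (0.2, 2.1), (0.3, 1.9),   # immune-dominant samples
]


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, n_iter=20):
    """Plain k-means: assign each point to its nearest centroid, recompute
    centroids as cluster means, and stop once centroids no longer change."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


# Three clusters, mirroring the three subtypes reported in the Results.
labels = kmeans(points, 3)
```

With real data, each point would be a vector over the upregulated DEGs (or pathway scores derived from them) rather than two hand-picked values, and the number of clusters would need to be justified rather than fixed in advance.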
Results: Based on 267 upregulated DEGs of the transcript profiles, patients with UC were classified into three subtypes [subtypes A-C] with distinct molecular and cellular signatures. Epithelial activation-related pathways were significantly enriched in subtype A [named epithelial proliferation], whereas subtype C was characterised as the immune activation subtype with prominent immune cells and proinflammatory signatures. Subtype B [named mixed] was modestly activated in all the signalling pathways. Notably, subtype A was more strongly associated than subtype C with a superior response to biologics such as golimumab, infliximab, vedolizumab, and ustekinumab.
Conclusions: We conducted a deep stratification of mucosal tissue using the most comprehensive microarray and RNA sequencing data, providing critical insights into pathophysiological features of UC, which could serve as a template for stratified treatment approaches.
Keywords: Machine learning; ulcerative colitis; unsupervised clustering.
© The Author(s) 2023. Published by Oxford University Press on behalf of European Crohn’s and Colitis Organisation.