Distinct molecular subtypes of systemic sclerosis and gene signature with diagnostic capability

Front Immunol. 2023 Oct 2:14:1257802. doi: 10.3389/fimmu.2023.1257802. eCollection 2023.

Abstract

Background: Systemic Sclerosis (SSc) is a connective tissue disease that affects multiple organ systems. This study aims to clarify the molecular subtypes of SSc, with the ultimate objective of establishing a diagnostic model that can inform clinical treatment decisions.

Methods: Five microarray datasets of SSc were retrieved from the GEO database. To eliminate batch effects, the ComBat algorithm was applied. Immune cell infiltration was evaluated using the xCell algorithm. The ConsensusClusterPlus algorithm was used to identify SSc subtypes. Limma was used to identify differentially expressed genes (DEGs). GSEA was used to determine pathway enrichment. Support vector machine (SVM), Random Forest (RF), Boruta, and LASSO algorithms were used to select feature genes. Diagnostic models were developed using SVM, RF, and Logistic Regression (LR). ROC curves were used to evaluate model performance. Compound-gene relationships were obtained from the Comparative Toxicogenomics Database (CTD).
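
The feature-selection and evaluation steps can be illustrated with a minimal Python sketch. It rests on two assumptions the abstract does not state: that the final feature genes are those chosen by all four selectors, and that model performance is summarized as the area under the ROC curve (AUC). The gene names, function names, and scores are hypothetical placeholders, not data from the study.

```python
from typing import Dict, Sequence, Set


def consensus_features(selections: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes chosen by every selector (assumed intersection rule)."""
    sets = list(selections.values())
    return set.intersection(*sets) if sets else set()


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Equals the fraction of (case, control) pairs in which the case
    receives the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical selector outputs; placeholder gene names only.
selected = {
    "svm": {"GENE_A", "GENE_B", "GENE_C"},
    "rf": {"GENE_B", "GENE_C"},
    "boruta": {"GENE_B", "GENE_C", "GENE_D"},
    "lasso": {"GENE_B", "GENE_C", "GENE_E"},
}
features = consensus_features(selected)

# Hypothetical classifier scores: 1 = SSc, 0 = control.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In this toy input the two placeholder genes shared by all selectors are retained, and three of the four case-control pairs are ranked correctly. A real analysis would apply library implementations of each selector and classifier to held-out expression data.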

Results: Three immune subtypes were identified in SSc samples based on the expression profiles of immune cells. Nineteen key DEGs shared among these subtypes were then used to classify SSc patients into three robust subtypes (gene_ClusterA-C). Gene_ClusterA exhibited significant enrichment of B cells, while gene_ClusterC showed significant enrichment of monocytes. Moderate activation of various immune cells was observed in gene_ClusterB. We identified 8 feature genes. The SVM model demonstrated superior diagnostic performance. Furthermore, correlation analysis revealed a robust association between the feature genes and immune cells. Eight pertinent compounds, namely methotrexate, resveratrol, paclitaxel, trichloroethylene, formaldehyde, silicon dioxide, benzene, and tetrachloroethylene, were identified from the CTD.

Conclusion: The present study devised a molecular subtyping method for patients with SSc and a machine learning-based diagnostic model to aid clinical treatment. The study also identified potential molecular targets for therapy, offering new perspectives for the treatment and investigation of SSc.

Keywords: diagnostic; immune microenvironment; molecular subtypes; systemic sclerosis; unsupervised machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • B-Lymphocytes
  • Benzene
  • Databases, Factual
  • Humans
  • Scleroderma, Systemic* / diagnosis
  • Scleroderma, Systemic* / genetics

Substances

  • Benzene

Grants and funding

The authors declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by the National Social Science Fund of China (21BTQ050), the Key R&D Project of Shanxi Province (202102130501003), and the Shanxi Key Laboratory of Big Data for Clinical Decision Research (2021D100012021515245001135236).