Exploring the shared gene signatures of smoking-related osteoporosis and chronic obstructive pulmonary disease using machine learning algorithms

Front Mol Biosci. 2023 May 11;10:1204031. doi: 10.3389/fmolb.2023.1204031. eCollection 2023.

Abstract

Objectives: Cigarette smoking has been recognized as a predisposing factor for both osteoporosis (OP) and chronic obstructive pulmonary disease (COPD). This study aimed to investigate the shared gene signatures affected by cigarette smoking in OP and COPD through gene expression profiling.

Materials and methods: Microarray datasets (GSE11784, GSE13850, GSE10006, and GSE103174) were obtained from the Gene Expression Omnibus (GEO) and analyzed by differential expression analysis, to identify differentially expressed genes (DEGs), and by weighted gene co-expression network analysis (WGCNA). Least absolute shrinkage and selection operator (LASSO) regression and a random forest (RF) machine learning algorithm were used to identify candidate biomarkers. The diagnostic value of the candidate biomarkers was assessed using logistic regression and receiver operating characteristic (ROC) curve analysis. Finally, immune cell infiltration was analyzed to identify dysregulated immune cells in cigarette smoking-induced COPD.

Results: In the smoking-related OP and COPD datasets, 2858 and 280 DEGs were identified, respectively. WGCNA revealed 982 genes strongly correlated with smoking-related OP, of which 32 overlapped with the hub genes of COPD. Gene Ontology (GO) enrichment analysis showed that the overlapping genes were enriched in the immune system category. Using LASSO regression and RF machine learning, six candidate genes were identified, and a logistic regression model was constructed that showed high diagnostic value in both the training set and the external validation datasets, with areas under the curve (AUCs) of 0.83 and 0.99, respectively. Immune cell infiltration analysis revealed dysregulation in several immune cell types, and six immune-associated genes were identified for smoking-related OP and COPD, namely, mucosa-associated lymphoid tissue lymphoma translocation protein 1 (MALT1), tissue-type plasminogen activator (PLAT), sodium channel epithelial 1 subunit alpha (SCNN1A), sine oculis homeobox 3 (SIX3), sperm-associated antigen 9 (SPAG9), and vacuolar protein sorting 35 (VPS35).

Conclusion: The findings suggest that immune cell infiltration profiles play a significant role in the shared pathogenesis of smoking-related OP and COPD. The results could provide valuable insights for developing novel therapeutic strategies for managing these disorders and shed light on their pathogenesis.
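
For readers less familiar with ROC analysis, the AUC values reported in the Results can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The following is a minimal sketch of that calculation in plain Python. It is not the authors' code, the function name roc_auc is hypothetical, and the labels and scores are invented toy values rather than data from the study.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (case, control) pairs in which
    the case has the higher score; tied pairs count as half a win."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy example (assumed values): 1 = case, 0 = control,
# scores = predicted probabilities from some fitted classifier.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls. The pairwise loop is quadratic in sample size, which is acceptable for a sketch; statistical libraries use a rank-based calculation instead.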

Keywords: bioinformatic analysis; chronic obstructive pulmonary disease; cigarette smoking; immune infiltration; machine learning algorithm; osteoporosis.

Grants and funding

This study was supported by the National Natural Science Foundation of China (grant numbers: 81873320 and 82274546), the Scientific Research Project of the Wuxi Municipal Health Commission (grant number: Q202232), and the Jiangsu Graduate Research and Practice Innovation Program (grant number: KYCX21_1698).