Analysis of potential genetic biomarkers and molecular mechanism of smoking-related postmenopausal osteoporosis using weighted gene co-expression network analysis and machine learning

PLoS One. 2021 Sep 23;16(9):e0257343. doi: 10.1371/journal.pone.0257343. eCollection 2021.

Abstract

Objectives: Smoking is a significant independent risk factor for postmenopausal osteoporosis, leading to genome variations in postmenopausal smokers. This study investigates potential biomarkers and molecular mechanisms of smoking-related postmenopausal osteoporosis (SRPO).

Materials and methods: The GSE13850 microarray dataset was downloaded from the Gene Expression Omnibus (GEO). Gene modules associated with SRPO were identified using weighted gene co-expression network analysis (WGCNA), protein-protein interaction (PPI) analysis, and pathway and functional enrichment analyses. Feature genes were selected using two machine learning methods: support vector machine-recursive feature elimination (SVM-RFE) and random forest (RF). The diagnostic efficiency of the selected genes was assessed by gene expression analysis and receiver operating characteristic (ROC) curve analysis.
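As a rough illustration of the network step, WGCNA typically turns pairwise gene correlations into a weighted adjacency by raising their absolute values to a soft-threshold power. The sketch below is a minimal pure-Python version of that idea, not the authors' pipeline: the function names, the power beta=6, and the toy expression values are assumptions made for illustration, and none of the numbers come from GSE13850.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for every gene pair.

    `expr` maps a gene name to its expression values across samples.
    `beta` is the soft-threshold power (assumed here; the abstract does
    not report the value used in the study).
    """
    genes = list(expr)
    return {
        (g1, g2): abs(pearson(expr[g1], expr[g2])) ** beta
        for g1 in genes
        for g2 in genes
        if g1 != g2
    }


# Toy expression values, made up for illustration only.
toy_expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [1.0, -1.0, 1.0, -1.0],
}
adjacency = soft_threshold_adjacency(toy_expr, beta=6)
```

Raising the correlation to a power keeps strongly co-expressed pairs (gene_a and gene_b here) close to 1 while pushing weak correlations toward 0, which is what lets modules stand out without a hard cutoff.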

Results: Eight highly conserved modules were detected in the WGCNA network, and the genes in the module that was strongly correlated with SRPO were used for constructing the PPI network. A total of 113 hub genes were identified in the core network using topological network analysis. Enrichment analysis results showed that hub genes were closely associated with the regulation of RNA transcription and translation, ATPase activity, and immune-related signaling. Six genes (HNRNPC, PFDN2, PSMC5, RPS16, TCEB2, and UBE2V2) were selected as genetic biomarkers for SRPO by integrating the feature selection of SVM-RFE and RF.
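One plausible reading of "integrating" the two feature selections is keeping only the genes that both methods rank highly. The abstract does not say how the integration was done, so the overlap rule, the function name, the cutoff top_k, and the placeholder gene names and scores below are all assumptions for illustration.

```python
def overlap_features(svm_rfe_ranking, rf_importance, top_k):
    """Return genes found in the top_k of both rankings.

    `svm_rfe_ranking` is a list of genes ordered best-first, as an
    SVM-RFE run would produce. `rf_importance` maps gene -> random
    forest importance score. The result keeps the SVM-RFE order.
    """
    rf_sorted = sorted(rf_importance, key=rf_importance.get, reverse=True)
    rf_top = set(rf_sorted[:top_k])
    return [gene for gene in svm_rfe_ranking[:top_k] if gene in rf_top]


# Placeholder rankings, invented for illustration (not study results).
svm_rfe_ranking = ["gene_a", "gene_b", "gene_c", "gene_d", "gene_e"]
rf_importance = {
    "gene_a": 0.30,
    "gene_b": 0.05,
    "gene_c": 0.25,
    "gene_d": 0.10,
    "gene_e": 0.28,
}
selected = overlap_features(svm_rfe_ranking, rf_importance, top_k=3)
```

Requiring agreement between two different selectors is a conservative choice: a gene favored by only one method, such as gene_b or gene_e in this toy input, is dropped.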

Conclusion: The present study identified potential genetic biomarkers and provided novel insight into the underlying molecular mechanism of SRPO.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / genetics
  • Computational Biology / methods
  • Databases, Genetic
  • Female
  • Gene Expression Profiling
  • Gene Regulatory Networks*
  • Genetic Markers*
  • Humans
  • Machine Learning
  • Microarray Analysis
  • Oligonucleotide Array Sequence Analysis
  • Osteoporosis, Postmenopausal / complications*
  • Osteoporosis, Postmenopausal / genetics*
  • Protein Interaction Mapping
  • Protein Interaction Maps
  • ROC Curve
  • Support Vector Machine
  • Tobacco Use Disorder / complications*

Substances

  • Biomarkers, Tumor
  • Genetic Markers

Grants and funding

This study was supported by the National Natural Science Foundation of China (No. 81873320, 81973878). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.