Combining single-cell RNA sequencing data and transcriptomic data to unravel potential mechanisms and signature genes of the progression of idiopathic pulmonary fibrosis to lung adenocarcinoma and predict therapeutic agents

Funct Integr Genomics. 2023 Nov 24;23(4):346. doi: 10.1007/s10142-023-01274-y.

Abstract

Patients with idiopathic pulmonary fibrosis (IPF) have a significantly higher prevalence of lung adenocarcinoma (LUAD) than normal subjects, but the underlying association remains unclear. The raw data were obtained from the Gene Expression Omnibus (GEO) database. Differential expression analysis and weighted gene co-expression network analysis were used to screen for differentially expressed genes (DEGs) and modular signature genes (MSGs). Genes at the intersection of DEGs and MSGs were considered hub genes for IPF and LUAD. Machine learning algorithms were applied to identify shared epithelial cell-derived signature genes (EDSGs). External cohort data were used to validate the robustness of the EDSGs. Immunohistochemical staining and Kaplan-Meier (K-M) plots were used to assess the prognostic value of the EDSGs in LUAD. Based on the EDSGs, we constructed a TF-gene-miRNA regulatory network. Molecular docking was used to assess the binding strength between candidate drugs and the EDSGs. IPF and LUAD shared epithelial cells, 650 DEGs, and 1773 MSGs. Pathway and functional enrichment analyses were performed on the 379 hub genes. Analysis of the sc-RNA seq data identified 1234 epithelial cell-derived marker genes in IPF and 1481 in LUAD; these marker genes shared 8 genes with the 379 hub genes. Using the machine learning algorithms, we further narrowed these to five EDSGs: TRIM2, S100A14, CYP4B1, LMO7, and SFN. The ROC curves supported the value of the EDSGs in predicting the onset of LUAD and IPF. The TF-gene-miRNA network revealed the regulatory relationships behind the EDSGs. Finally, we predicted candidate therapeutic agents. Our study preliminarily identified potential mechanisms linking IPF and LUAD, which will inform subsequent studies.
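The gene-screening logic described in the abstract (hub genes as the intersection of DEGs and MSGs, then overlap with epithelial cell-derived marker genes) can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: the function name, the toy gene symbols, and the choice to intersect the IPF and LUAD marker sets before comparing them with the hub genes are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple


def hub_and_candidates(
    shared_degs: Iterable[str],
    shared_msgs: Iterable[str],
    ipf_epi_markers: Iterable[str],
    luad_epi_markers: Iterable[str],
) -> Tuple[Set[str], Set[str]]:
    """Return (hub genes, candidate signature genes) from four gene lists.

    Hypothetical helper: the abstract does not specify how the overlaps
    were computed, so this assumes simple set intersections.
    """
    # Hub genes: genes present in both the DEG list and the MSG list.
    hub_genes = set(shared_degs) & set(shared_msgs)
    # Assumption: a marker must be epithelial-derived in both diseases.
    shared_markers = set(ipf_epi_markers) & set(luad_epi_markers)
    # Candidates: hub genes that are also shared epithelial markers.
    candidates = hub_genes & shared_markers
    return hub_genes, candidates


# Toy placeholder symbols only; these are not genes from the study.
toy_degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
toy_msgs = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
toy_ipf_markers = {"GENE_C", "GENE_D", "GENE_F"}
toy_luad_markers = {"GENE_C", "GENE_D", "GENE_G"}

hub, candidates = hub_and_candidates(
    toy_degs, toy_msgs, toy_ipf_markers, toy_luad_markers
)
print(sorted(hub))         # ['GENE_B', 'GENE_C', 'GENE_D']
print(sorted(candidates))  # ['GENE_C', 'GENE_D']
```

In the study, the candidates from this kind of overlap (8 genes) were then reduced to the five EDSGs by machine learning algorithms, a step this sketch does not attempt to reproduce.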

Keywords: Bioinformatics technology; Epithelial cells; Idiopathic pulmonary fibrosis (IPF); Lung adenocarcinoma (LUAD); Machine algorithm learning; Molecular docking; Signature genes; Single-cell RNA sequencing (sc-RNA seq).

MeSH terms

  • Adenocarcinoma of Lung* / drug therapy
  • Adenocarcinoma of Lung* / genetics
  • Humans
  • Idiopathic Pulmonary Fibrosis* / drug therapy
  • Idiopathic Pulmonary Fibrosis* / genetics
  • Lung Neoplasms* / drug therapy
  • Lung Neoplasms* / genetics
  • MicroRNAs* / genetics
  • Molecular Docking Simulation
  • Sequence Analysis, RNA
  • Transcriptome

Substances

  • MicroRNAs