Identification of disease-specific genes related to immune infiltration in nonalcoholic steatohepatitis using machine learning algorithms

Medicine (Baltimore). 2024 May 17;103(20):e38001. doi: 10.1097/MD.0000000000038001.

Abstract

To identify disease signature genes associated with immune infiltration in nonalcoholic steatohepatitis (NASH), we downloaded 2 publicly available gene expression profiles, GSE164760 and GSE37031, from the Gene Expression Omnibus database. These profiles, which comprise human NASH and control samples, were used to screen for differentially expressed genes (DEGs). Two machine learning methods, least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE), were used to identify candidate disease signature genes. The CIBERSORT deconvolution algorithm was used to analyze the infiltration of 22 immune cell types in NASH. Additionally, we constructed a NASH cell model using HepG2 cells treated with oleic acid and free fatty acids. Construction of the cell model was verified by oil red O staining, and Western blotting was used to detect protein expression of the disease signature genes in the control and model groups. A total of 262 DEGs were identified. These DEGs were primarily associated with metal ion transmembrane transporter activity, sodium ion transmembrane transporter protein activity, calcium ion, and neuroactive ligand-receptor interactions. FOS, IGFBP2, dual-specificity phosphatase 1 (DUSP1), and IKZF3 were identified as disease signature genes of NASH by applying the LASSO and SVM-RFE algorithms to the DEGs. Receiver operating characteristic (ROC) curves showed that FOS, IGFBP2, DUSP1, and IKZF3 had good diagnostic value (area under the ROC curve > 0.8). These findings were validated in the GSE89632 dataset and in cellular assays. Immune cell infiltration analysis revealed that NASH was associated with CD8 T cells, CD4 T cells, follicular helper T cells, resting NK cells, eosinophils, regulatory T cells, and γδ T cells. FOS, IGFBP2, DUSP1, and IKZF3 were specifically associated with follicular helper T cells. Lipid droplet aggregation increased significantly in HepG2 cells treated with oleic acid and free fatty acids, indicating successful construction of the cell model. In this model, the expression of FOS, IGFBP2, and DUSP1 was significantly decreased, whereas that of IKZF3 was significantly elevated (P < .01, P < .001) compared with the control group. Therefore, FOS, IGFBP2, DUSP1, and IKZF3 can be considered disease signature genes associated with immune infiltration in NASH.
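The diagnostic criterion quoted in the abstract (area under the ROC curve > 0.8) can be illustrated with a minimal sketch. It is not the authors' pipeline: the gene names `GENE_UP`, `GENE_DOWN`, and `GENE_FLAT` and all expression values are synthetic placeholders, and the helper names `roc_auc` and `direction_free_auc` are hypothetical. The sketch computes the AUC as the probability that a randomly chosen NASH sample has a higher value than a randomly chosen control (ties count as one half), and assumes that a gene lower in disease is scored by flipping its direction.

```python
from typing import Dict, List, Sequence


def roc_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """AUC via the Mann-Whitney statistic: P(case > control), ties count 0.5."""
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def direction_free_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """AUC that treats a gene lower in disease like one higher in disease.

    Assumption of this sketch: the direction of change is allowed to flip.
    """
    auc = roc_auc(case_values, control_values)
    return max(auc, 1.0 - auc)


# Synthetic expression values, invented for illustration only.
toy_expression: Dict[str, Dict[str, List[float]]] = {
    "GENE_UP": {"nash": [5.1, 6.0, 5.7, 6.3], "control": [4.0, 4.4, 5.2, 3.9]},
    "GENE_DOWN": {"nash": [2.0, 2.5, 1.8, 3.1], "control": [3.0, 3.6, 2.9, 3.3]},
    "GENE_FLAT": {"nash": [4.0, 5.0, 4.5, 5.5], "control": [4.2, 5.1, 4.4, 5.4]},
}

AUC_THRESHOLD = 0.8  # cut-off quoted in the abstract

aucs = {
    gene: direction_free_auc(groups["nash"], groups["control"])
    for gene, groups in toy_expression.items()
}
# Keep only genes whose AUC exceeds the threshold.
selected = sorted(gene for gene, auc in aucs.items() if auc > AUC_THRESHOLD)

if __name__ == "__main__":
    for gene, auc in aucs.items():
        print(f"{gene}: AUC = {auc:.3f}")
    print("passing the threshold:", selected)
```

The pairwise loop is quadratic in sample count, which is acceptable for illustration. For real cohorts a rank-based implementation such as scikit-learn's `roc_auc_score` would be the more usual choice.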

MeSH terms

  • Algorithms
  • Gene Expression Profiling / methods
  • Hep G2 Cells
  • Humans
  • Machine Learning*
  • Non-alcoholic Fatty Liver Disease* / genetics
  • Non-alcoholic Fatty Liver Disease* / immunology
  • Support Vector Machine
  • Transcriptome