Integrated proteome sequencing, bulk RNA sequencing and single-cell RNA sequencing to identify potential biomarkers in different grades of intervertebral disc degeneration

Front Cell Dev Biol. 2023 Mar 16:11:1136777. doi: 10.3389/fcell.2023.1136777. eCollection 2023.

Abstract

Low back pain (LBP) is a prevalent health problem worldwide that affects over 80% of adults during their lifetime. Intervertebral disc degeneration (IDD) is a well-recognized leading cause of LBP. IDD is classified into five grades according to the Pfirrmann classification system. The purpose of this study was to identify potential biomarkers in different IDD grades through an integrated analysis of proteome sequencing (PRO-seq), bulk RNA sequencing (bRNA-seq) and single-cell RNA sequencing (scRNA-seq) data. Eight cases of grade I-IV IDD were obtained. Grades I and II were considered non-degenerative discs (relatively normal), whereas grades III and IV were considered degenerative discs. PRO-seq analysis was performed to identify differentially expressed proteins (DEPs) in various IDD grades. Differential expression analysis was performed on bRNA-seq data to identify differentially expressed genes (DEGs) between normal and degenerated discs. In addition, scRNA-seq was performed to validate DEGs in degenerated and non-degenerated nucleus pulposus (NP). Machine learning (ML) algorithms were used to screen hub genes. The receiver operating characteristic (ROC) curve was used to validate the efficiency of the screened hub genes in predicting IDD. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to analyze functional enrichment and signaling pathways. A protein-protein interaction (PPI) network was used to prioritize disease-related proteins. SERPINA1, ORM2, FGG and COL1A1 were identified through PRO-seq as the hub proteins involved in regulating IDD. ML algorithms selected ten hub genes from the bRNA-seq data: IBSP, COL6A2, MMP2, SERPINA1, ACAN, FBLN7, LAMB2, TTLL7, COL9A3, and THBS4. Since serine protease inhibitor clade A member 1 (SERPINA1) was the only gene common to both sets, its accuracy in degenerated and non-degenerated NP cells was validated using scRNA-seq. A rat model of caudal intervertebral disc degeneration was then established. The expression of SERPINA1 and ORM2 was detected using immunohistochemical staining of human and rat intervertebral discs. The results showed that SERPINA1 was poorly expressed in the degenerative group. We further explored the potential function of SERPINA1 using Gene Set Enrichment Analysis (GSEA) and cell-cell communication analysis. Therefore, SERPINA1 can be used as a biomarker to regulate or predict the progression of disc degeneration.
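
The overlap step described above, in which SERPINA1 emerges as the only marker shared by the PRO-seq hub proteins and the bRNA-seq hub genes, can be illustrated with a simple set intersection. This is a minimal sketch and not the authors' pipeline: the gene symbols are taken from the abstract, while the function name `common_markers` is a hypothetical helper introduced only for illustration.

```python
# Hub proteins (PRO-seq) and hub genes (bRNA-seq) as listed in the abstract.
hub_proteins = {"SERPINA1", "ORM2", "FGG", "COL1A1"}
hub_genes = {
    "IBSP", "COL6A2", "MMP2", "SERPINA1", "ACAN",
    "FBLN7", "LAMB2", "TTLL7", "COL9A3", "THBS4",
}


def common_markers(proteins, genes):
    """Hypothetical helper: return the sorted symbols present in both collections."""
    return sorted(set(proteins) & set(genes))


shared = common_markers(hub_proteins, hub_genes)
print(shared)  # ['SERPINA1']
```

Matching on gene symbols assumes that protein and gene identifiers have already been mapped to the same nomenclature; in practice an identifier-mapping step may be needed before intersecting the two lists.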

Keywords: bioinformatics; bulk RNA sequencing; intervertebral disc degeneration; machine learning; proteome sequencing; single-cell RNA sequencing.

Grants and funding

This study was supported by the Natural Science Foundation of Chongqing (grant number cstc2021jcyj-msxmX0134).