Data-driven identification of predictive risk biomarkers for subgroups of osteoarthritis using interpretable machine learning

Nat Commun. 2024 Apr 1;15(1):2817. doi: 10.1038/s41467-024-46663-4.

Abstract

Osteoarthritis (OA) is increasing in prevalence and has a severe impact on patients' lives. However, our understanding of biomarkers driving OA risk remains limited. We developed a model predicting the five-year risk of OA diagnosis, integrating retrospective clinical, lifestyle and biomarker data from the UK Biobank (19,120 patients with OA; ROC-AUC: 0.72, 95% CI: 0.71-0.73). Higher age, BMI and prescription of non-steroidal anti-inflammatory drugs contributed most to increased OA risk prediction ahead of diagnosis. We identified 14 subgroups of OA risk profiles. These subgroups were validated in an independent set of patients evaluating the 11-year OA risk, with 88% of patients being uniquely assigned to one of the 14 subgroups. Individual OA risk profiles were characterised by personalised biomarkers. Omics integration demonstrated the predictive importance of key OA genes and pathways (e.g., GDF5 and TGF-β signalling) and OA-specific biomarkers (e.g., CRTAC1 and COL9A1). In summary, this work identifies opportunities for personalised OA prevention and insights into its underlying pathogenesis.
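
The abstract reports discrimination as a ROC-AUC with a 95% confidence interval but does not say how the interval was obtained. As a rough illustration only, the sketch below computes ROC-AUC from the Mann-Whitney rank statistic and a percentile bootstrap interval. The bootstrap choice, the function names `roc_auc` and `bootstrap_ci`, and the toy labels and scores are all assumptions made for this example; none of them come from the study.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney U statistic; tied scores share their average rank.

    `labels` must contain only 0 (no OA diagnosis) and 1 (OA diagnosis).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Rank all scores in ascending order, averaging ranks within tied runs.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ROC-AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data, invented purely for illustration (not UK Biobank values).
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.60]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=500)
```

With a cohort as large as the one described, a bootstrap interval would be expected to be narrow, in line with the reported 0.71-0.73 range, whereas a toy sample of eight records gives a very wide one.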

MeSH terms

  • Anti-Inflammatory Agents, Non-Steroidal / therapeutic use
  • Biomarkers
  • Calcium-Binding Proteins
  • Humans
  • Machine Learning
  • Osteoarthritis* / diagnosis
  • Osteoarthritis* / drug therapy
  • Osteoarthritis* / genetics
  • Retrospective Studies

Substances

  • Biomarkers
  • Anti-Inflammatory Agents, Non-Steroidal
  • CRTAC1 protein, human
  • Calcium-Binding Proteins