Early identification of persistent somatic symptoms in primary care: data-driven and theory-driven predictive modelling based on electronic medical records of Dutch general practices

BMJ Open. 2023 May 2;13(5):e066183. doi: 10.1136/bmjopen-2022-066183.

Abstract

Objective: The present study aimed to identify patients with persistent somatic symptoms (PSS) early in primary care by exploring approaches based on routine care data.

Design/setting: A cohort study for predictive modelling was conducted using routine primary care data from 76 general practices in the Netherlands.

Participants: A total of 94 440 adult patients were included on the basis of at least 7 years of general practice enrolment, more than one symptom/disease registration and more than 10 consultations.

Methods: Cases were selected based on their first PSS registration in 2017-2018. Candidate predictors were drawn from the 2-5 years before the PSS registration and categorised into data-driven approaches (symptoms/diseases, medications, referrals, sequential patterns and changing lab results) and theory-driven approaches (constructed factors based on literature and terminology in free text). From these, 12 candidate predictor categories were formed and used to develop prediction models by cross-validated least absolute shrinkage and selection operator (LASSO) regression on 80% of the dataset. The derived models were internally validated on the remaining 20% of the dataset.
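
The modelling step described above (an 80%/20% split, with the penalty of a least absolute shrinkage and selection operator model chosen by cross-validation on the 80% part) can be illustrated with a minimal sketch. It is not the authors' code. It assumes a binary outcome modelled by L1-penalised logistic regression, uses held-out log loss as the cross-validation criterion, and runs on synthetic data; the solver, the penalty grid, the number of folds and every name in the block are hypothetical choices made for illustration.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; this is what sets weak predictors to exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(X, y, lam, n_iter=150, step=0.5):
    """L1-penalised logistic regression fitted by proximal gradient descent (assumed solver)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def log_loss(X, y, w, b):
    """Mean negative log-likelihood of the outcomes under the fitted model."""
    total = 0.0
    for xi, yi in zip(X, y):
        prob = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
        prob = min(max(prob, 1e-12), 1.0 - 1e-12)
        total -= yi * math.log(prob) + (1 - yi) * math.log(1.0 - prob)
    return total / len(X)

def choose_lambda(X, y, lambdas, k=5):
    """Return the penalty with the lowest mean held-out log loss over k folds."""
    folds = [list(range(i, len(X), k)) for i in range(k)]
    best_lam, best_loss = None, float("inf")
    for lam in lambdas:
        losses = []
        for fold in folds:
            held = set(fold)
            keep = [i for i in range(len(X)) if i not in held]
            w, b = fit_lasso_logistic([X[i] for i in keep], [y[i] for i in keep], lam)
            losses.append(log_loss([X[i] for i in fold], [y[i] for i in fold], w, b))
        mean_loss = sum(losses) / k
        if mean_loss < best_loss:
            best_lam, best_loss = lam, mean_loss
    return best_lam

# Synthetic stand-in for binary candidate predictors (for example, code present or absent).
rng = random.Random(0)
n_patients, n_predictors = 200, 6
X = [[float(rng.random() < 0.3) for _ in range(n_predictors)] for _ in range(n_patients)]
# By construction, only the first two predictors influence the simulated outcome.
y = [int(rng.random() < sigmoid(-1.5 + 1.5 * row[0] + 1.0 * row[1])) for row in X]

# 80% development set, 20% internal validation set.
order = list(range(n_patients))
rng.shuffle(order)
cut = int(0.8 * n_patients)
X_train, y_train = [X[i] for i in order[:cut]], [y[i] for i in order[:cut]]
X_test, y_test = [X[i] for i in order[cut:]], [y[i] for i in order[cut:]]

lambdas = [0.001, 0.01, 0.05]  # hypothetical penalty grid
best_lambda = choose_lambda(X_train, y_train, lambdas)
weights, intercept = fit_lasso_logistic(X_train, y_train, best_lambda)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
test_loss = log_loss(X_test, y_test, weights, intercept)
print("chosen penalty:", best_lambda, "selected predictors:", selected)
```

The soft-thresholding step is what gives the penalty its selection property: a coefficient whose update stays within the threshold is set to exactly zero and drops out of the model. In practice a statistical library would normally replace the hand-written solver.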

Results: All models had comparable predictive performance (area under the receiver operating characteristic curve=0.70 to 0.72). Predictors were related to genital complaints, specific symptoms (eg, digestive, fatigue and mood), healthcare utilisation and number of complaints. The most fruitful predictor categories were the literature-based and medication categories. Predictors often had overlapping constructs, such as digestive symptoms (symptom/disease codes) and drugs for constipation (medication codes), indicating that registration is inconsistent between general practitioners (GPs).
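
The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The following minimal sketch computes it that way on made-up labels and scores; the function name and the example values are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of case/non-case pairs ranked correctly (ties count as 0.5)."""
    cases = [s for l, s in zip(labels, scores) if l == 1]
    non_cases = [s for l, s in zip(labels, scores) if l == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))

# Made-up example: 3 cases and 4 non-cases with predicted risks.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(auc)  # 9 of the 12 pairs are ranked correctly, so 0.75
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.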

Conclusions: The findings indicate low to moderate diagnostic accuracy for early identification of PSS based on routine primary care data. Nonetheless, simple clinical decision rules based on structured symptom/disease or medication codes may be an efficient way to support GPs in identifying patients at risk of PSS. Fully data-based prediction currently appears to be hampered by inconsistent and missing registrations. Future research on predictive modelling of PSS using routine care data should focus on data enrichment or free-text mining to overcome inconsistent registrations and improve predictive accuracy.

Keywords: MENTAL HEALTH; PRIMARY CARE; STATISTICS & RESEARCH METHODS.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Cohort Studies
  • Electronic Health Records
  • General Practice*
  • Humans
  • Medically Unexplained Symptoms*
  • Primary Health Care