Identifying features of risk periods for suicide attempts using document frequency and language use in electronic health records

Front Psychiatry. 2023 Dec 11:14:1217649. doi: 10.3389/fpsyt.2023.1217649. eCollection 2023.

Abstract

Background: Individualising mental healthcare at times when a patient is most at risk of suicide involves shifting research emphasis from static risk factors to those that may be modifiable with interventions. Currently, risk assessment is based on a range of extensively reported stable risk factors, but critical to dynamic suicide risk assessment is an understanding of each individual patient's health trajectory over time. The use of electronic health records (EHRs) and analysis using machine learning has the potential to accelerate progress in developing early warning indicators.

Setting: EHR data from the South London and Maudsley NHS Foundation Trust (SLaM), which provides secondary mental healthcare for 1.8 million people living in four South London boroughs.

Objectives: To determine whether the time window proximal to a hospitalised suicide attempt can be discriminated from a distal period of lower risk by analysing the documentation and mental health clinical free text data from EHRs. Specifically, to (i) investigate whether the rate at which EHR documents are recorded per patient is associated with a suicide attempt; (ii) compare document-level word usage between documents proximal and distal to a suicide attempt; and (iii) compare n-gram frequency related to third-person pronoun use proximal and distal to a suicide attempt using machine learning.
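To illustrate what "n-gram frequency" means in objective (iii), the following is a minimal sketch of counting uni-, bi-, and tri-grams in a piece of text. The tokeniser, the function name `ngram_counts`, and the example sentence are assumptions made for illustration; they are not the study's pipeline or data.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    """Count contiguous n-token sequences in lower-cased text.

    Tokenisation here is a simple regex (an assumption for illustration).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical sentence, not taken from any clinical record.
example = "She said she feels low. She said she cannot sleep."

unigrams = ngram_counts(example, 1)
bigrams = ngram_counts(example, 2)
trigrams = ngram_counts(example, 3)
```

Comparing such counts between proximal and distal documents is one simple way to see which word sequences differ between the two periods.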

Methods: The Clinical Record Interactive Search (CRIS) system allowed access to de-identified information from the EHRs. CRIS has been linked with Hospital Episode Statistics (HES) data for Admitted Patient Care. We analysed document and event data for patients who had at some point between 1 April 2006 and 31 March 2013 been hospitalised with a HES ICD-10 code related to attempted suicide (X60-X84; Y10-Y34; Y87.0/Y87.2).
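The ICD-10 inclusion criterion above can be sketched as a small code filter. This is only an illustration under stated assumptions: the function name `is_attempt_code` is hypothetical, codes are assumed to arrive as a letter, two digits, and an optional single-digit subdivision (with or without a dot), and nothing here reflects how the CRIS-HES linkage actually stores or queries codes.

```python
import re

def is_attempt_code(code):
    """Return True if an ICD-10 code falls in X60-X84, Y10-Y34, Y87.0 or Y87.2.

    Assumes a format such as 'X60', 'X60.1' or 'X601' (illustrative only).
    """
    code = code.strip().upper().replace(".", "")
    m = re.fullmatch(r"([XY])(\d{2})(\d?)", code)
    if not m:
        return False
    letter, block, sub = m.group(1), int(m.group(2)), m.group(3)
    if letter == "X":
        return 60 <= block <= 84
    # Remaining case: letter == "Y"
    if 10 <= block <= 34:
        return True
    return block == 87 and sub in ("0", "2")
```

Under this sketch, `Y87` without a fourth character is rejected, because only the subdivisions Y87.0 and Y87.2 are listed in the criterion.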

Findings: n = 8,247 patients were identified as having made a hospitalised suicide attempt. Of these, n = 3,167 (39.8%) had at least one document available in their EHR prior to their first suicide attempt, and n = 1,424 (45.0%) of these patients had been "monitored" by mental healthcare services in the preceding 30 days. From 60 days prior to a first suicide attempt, the monitoring level (document recording in the past 30 days) rose rapidly from 35.1 to 45.0%. Documents containing words related to prescribed medications/drugs/overdose/poisoning/addiction had the highest odds of being a risk indicator used proximal to a suicide attempt (OR 1.88; precision 0.91 and recall 0.93), and documents with words citing a care plan were associated with the lowest risk for a suicide attempt (OR 0.22; precision 1.00 and recall 1.00). Function words, word sequence, and pronouns were most common in all three representations (uni-, bi-, and tri-gram).
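The "monitoring level" reported above can be read as the share of patients with at least one document recorded in a 30-day window. A minimal sketch of that calculation follows; the function names, the half-open window (excluding the reference date itself), and the toy dates are all assumptions for illustration and do not come from the study.

```python
from datetime import date, timedelta

def was_monitored(doc_dates, ref_date, window_days=30):
    """True if any document is dated within window_days before ref_date.

    The window is [ref_date - window_days, ref_date); this boundary choice
    is an assumption, not the study's definition.
    """
    start = ref_date - timedelta(days=window_days)
    return any(start <= d < ref_date for d in doc_dates)

def monitoring_level(patients, ref_offset_days=0, window_days=30):
    """Share of patients monitored in the window ending ref_offset_days before their attempt."""
    flags = [
        was_monitored(docs, attempt - timedelta(days=ref_offset_days), window_days)
        for docs, attempt in patients
    ]
    return sum(flags) / len(flags)

# Hypothetical patients: (document dates, date of first attempt).
patients = [
    ([date(2010, 1, 5)], date(2010, 1, 20)),
    ([date(2009, 6, 1)], date(2010, 1, 20)),
    ([date(2009, 12, 1), date(2010, 1, 19)], date(2010, 3, 1)),
]

level_at_attempt = monitoring_level(patients)
level_30_days_before = monitoring_level(patients, ref_offset_days=30)
```

Evaluating `monitoring_level` at a range of offsets (for example 60, 30, and 0 days before the attempt) gives the kind of trajectory described in the findings.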

Conclusion: EHR documentation frequency and language use can be used to distinguish periods distal from and proximal to a suicide attempt. However, in our study 55.0% of patients with documentation prior to their first suicide attempt did not have a record in the preceding 30 days, meaning that a high number are not seen by services at their most vulnerable point.

Keywords: assessment; electronic health records; language; risk; suicide.

Grants and funding

This study utilised the Clinical Record Interactive Search (CRIS) platform funded and developed by the National Institute for Health and Care Research (NIHR) Maudsley Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London, and a joint infrastructure grant from Guy’s and St Thomas’ Charity and the Maudsley Charity (grant number BRC-2011-10035). RD and GG were funded by a Clinician Scientist Fellowship (project e-HOST-IT) from the Health Foundation in partnership with the Academy of Medical Sciences. RD was also funded by an MRC award (grant code: MR/S020365/1). SV was funded by the Swedish Research Council (2015-00359), Marie Skłodowska-Curie Actions, COFUND, Project INCA 600398. JD received support from an NIHR Clinician Scientist Fellowship (CS-2018-18-ST2-014) and Psychiatry Research Trust Peggy Pollak Research Fellowship in Developmental Psychiatry. AR is part-supported by Health Data Research UK, an initiative funded by UK Research and Innovation, the Department of Health and Social Care (England) and the devolved administrations, and leading medical research charities. RS is part-funded by an NIHR Senior Investigator Award and by the NIHR Applied Research Collaboration South London (NIHR ARC South London) at King’s College Hospital NHS Foundation Trust. This study represents independent research part-funded by the NIHR Maudsley Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. For the purposes of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Accepted Author Manuscript version arising from this submission.