Using clinical Natural Language Processing for health outcomes research: Overview and actionable suggestions for future advances

Sumithra Velupillai; Hanna Suominen; Maria Liakata; Angus Roberts; Anoop D Shah; Katherine Morley; David Osborn; Joseph Hayes; Robert Stewart; Johnny Downs; Wendy Chapman; Rina Dutta

doi:10.1016/j.jbi.2018.10.005

J Biomed Inform. 2018 Dec:88:11-19. doi: 10.1016/j.jbi.2018.10.005. Epub 2018 Oct 24.

Authors

Sumithra Velupillai¹, Hanna Suominen², Maria Liakata³, Angus Roberts⁴, Anoop D Shah⁵, Katherine Morley⁶, David Osborn⁷, Joseph Hayes⁸, Robert Stewart⁹, Johnny Downs¹⁰, Wendy Chapman¹¹, Rina Dutta¹²

Affiliations

¹ Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; School of Electrical Engineering and Computer Science, KTH, Stockholm, Sweden. Electronic address: sumithra.velupillai@kcl.ac.uk.
² College of Engineering and Computer Science, The Australian National University, Data61/CSIRO, University of Canberra, Australia; University of Turku, Finland. Electronic address: Hanna.Suominen@anu.edu.au.
³ Department of Computer Science, University of Warwick/Alan Turing Institute, UK. Electronic address: m.liakata@warwick.ac.uk.
⁴ Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK. Electronic address: angus.roberts@kcl.ac.uk.
⁵ Institute of Health Informatics, University College London, UK; University College London NHS Foundation Trust, London, UK. Electronic address: anoop@doctors.org.uk.
⁶ Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; Melbourne School of Population and Global Health, The University of Melbourne, Australia. Electronic address: katherine.morley@kcl.ac.uk.
⁷ Division of Psychiatry, University College London, UK; Camden and Islington NHS Foundation Trust, London, UK. Electronic address: davidd.osborn@ucl.ac.uk.
⁸ Division of Psychiatry, University College London, UK; Camden and Islington NHS Foundation Trust, London, UK. Electronic address: josephj.hayes@ucl.ac.uk.
⁹ Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; South London and Maudsley NHS Foundation Trust, London, UK. Electronic address: robert.stewart@kcl.ac.uk.
¹⁰ Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; South London and Maudsley NHS Foundation Trust, London, UK. Electronic address: johnny.downs@kcl.ac.uk.
¹¹ Department of Biomedical Informatics, University of Utah, United States. Electronic address: wendy.chapman@utah.edu.
¹² Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; South London and Maudsley NHS Foundation Trust, London, UK. Electronic address: rina.dutta@kcl.ac.uk.

Abstract

The importance of incorporating Natural Language Processing (NLP) methods in clinical informatics research has been increasingly recognized over the past years, and has led to transformative advances. Typically, clinical NLP systems are developed and evaluated on word, sentence, or document level annotations that model specific attributes and features, such as document content (e.g., patient status, or report type), document section types (e.g., current medications, past medical history, or discharge summary), named entities and concepts (e.g., diagnoses, symptoms, or treatments) or semantic attributes (e.g., negation, severity, or temporality). From a clinical perspective, on the other hand, research studies are typically modelled and evaluated on a patient- or population-level, such as predicting how a patient group might respond to specific treatments or patient monitoring over time. While some NLP tasks consider predictions at the individual or group user level, these tasks still constitute a minority. Owing to the discrepancy between scientific objectives of each field, and because of differences in methodological evaluation priorities, there is no clear alignment between these evaluation approaches. Here we provide a broad summary and outline of the challenging issues involved in defining appropriate intrinsic and extrinsic evaluation methods for NLP research that is to be used for clinical outcomes research, and vice versa. A particular focus is placed on mental health research, an area still relatively understudied by the clinical NLP research community, but where NLP methods are of notable relevance. Recent advances in clinical NLP method development have been significant, but we propose more emphasis needs to be placed on rigorous evaluation for the field to advance further. To enable this, we provide actionable suggestions, including a minimal protocol that could be used when reporting clinical NLP method development and its evaluation.

Keywords: Clinical informatics; Epidemiology; Evaluation; Information extraction; Mental Health Informatics; Natural Language Processing; Public Health; Text analytics.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Data Collection / methods
Electronic Health Records*
Humans
Medical Informatics / methods*
Medical Informatics / trends
Mental Disorders / therapy
Mental Health Services / organization & administration*
Natural Language Processing*
Outcome Assessment, Health Care
Reproducibility of Results
Semantics*
