A Unified Machine Reading Comprehension Framework for Cohort Selection

Ying Xiong; Weihua Peng; Qingcai Chen; Zhengxing Huang; Buzhou Tang

doi:10.1109/JBHI.2021.3095478

IEEE J Biomed Health Inform. 2022 Jan;26(1):379-387. doi: 10.1109/JBHI.2021.3095478. Epub 2022 Jan 17.

PMID: 34236972
DOI: 10.1109/JBHI.2021.3095478

Abstract

Cohort selection, which determines whether an individual satisfies given selection criteria, is an essential prerequisite for clinical research. Previous work on cohort selection usually treated each selection criterion independently, ignoring both the meaning of each criterion and the relations among criteria. To address these problems, we propose a novel unified machine reading comprehension (MRC) framework. In this framework, we design simple rules to generate questions for each criterion from the cohort selection guidelines and treat clues extracted from patients' medical records by trigger words as passages. A series of state-of-the-art MRC models based on BiDAF, BiMPM, BERT, BioBERT, NCBI-BERT, and RoBERTa are deployed to determine which question-passage pairs match. We also introduce a cross-criterion attention mechanism over the representations of question-passage pairs to model relations among cohort selection criteria. Results on two datasets, the dataset of the 2018 National NLP Clinical Challenge (N2C2) for cohort selection and a dataset derived from MIMIC-III, show that our NCBI-BERT MRC model with the cross-criterion attention mechanism achieves the highest micro-averaged F1-score: 0.9070 on the N2C2 dataset and 0.8353 on the MIMIC-III dataset. On the N2C2 dataset, it is competitive with the best system, which relies on a large number of rules defined by medical experts. Comparing these two models, we find that the NCBI-BERT MRC model performs worse mainly on criteria involving mathematical logic. When rules replace the NCBI-BERT MRC model for some of these mathematical-logic criteria on the N2C2 dataset, we obtain a new benchmark F1-score of 0.9163, indicating that rules are easy to integrate into MRC models for further improvement.
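
The input construction described in the abstract (one question per criterion, with a passage made of trigger-word clues from the medical record) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the criterion names, question wording, trigger words, the naive sentence splitter, and the helper names `split_sentences`, `extract_passage`, and `build_pairs` are all hypothetical assumptions introduced here.

```python
import re

# Hypothetical placeholders: these criterion names, questions, and trigger
# words are assumptions for illustration, not the rules used in the paper.
QUESTIONS = {
    "alcohol_abuse": "Does the patient currently abuse alcohol?",
    "aspirin_for_mi": "Does the patient take aspirin to prevent myocardial infarction?",
}
TRIGGERS = {
    "alcohol_abuse": ["alcohol", "etoh", "drinks"],
    "aspirin_for_mi": ["aspirin", "asa"],
}


def split_sentences(record):
    """Naive splitter: break after '.', '?' or '!' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", record) if s.strip()]


def extract_passage(record, triggers):
    """Join the sentences containing a trigger word (whole word, case-insensitive)."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in triggers) + r")\b",
        re.IGNORECASE,
    )
    return " ".join(s for s in split_sentences(record) if pattern.search(s))


def build_pairs(record):
    """Return one (criterion, question, passage) triple per criterion."""
    return [
        (name, QUESTIONS[name], extract_passage(record, TRIGGERS[name]))
        for name in QUESTIONS
    ]


if __name__ == "__main__":
    record = (
        "Patient denies tobacco use. He drinks ETOH daily. "
        "Started ASA 81 mg for cardiac protection."
    )
    for name, question, passage in build_pairs(record):
        print(name, "|", question, "|", passage)
```

Each resulting question-passage pair would then be passed to a matching model (for example a BERT-style classifier) that decides whether the criterion is met; that step, and the cross-criterion attention over pair representations, are not shown here.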

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Cohort Studies
Comprehension*
Electronic Health Records*
Humans
Natural Language Processing
Patient Selection