Deep learning with weak annotation from diagnosis reports for detection of multiple head disorders: a prospective, multicentre study

Yuchen Guo; Yuwei He; Jinhao Lyu; Zhanping Zhou; Dong Yang; Liangdi Ma; Hao-Tian Tan; Changjian Chen; Wei Zhang; Jianxing Hu; Dongshan Han; Guiguang Ding; Shixia Liu; Hui Qiao; Feng Xu; Xin Lou; Qionghai Dai

doi:10.1016/S2589-7500(22)00090-5

Lancet Digit Health. 2022 Aug;4(8):e584-e593. doi: 10.1016/S2589-7500(22)00090-5. Epub 2022 Jun 17.

Authors

Yuchen Guo¹, Yuwei He², Jinhao Lyu³, Zhanping Zhou², Dong Yang², Liangdi Ma², Hao-Tian Tan², Changjian Chen², Wei Zhang⁴, Jianxing Hu³, Dongshan Han³, Guiguang Ding², Shixia Liu², Hui Qiao⁵, Feng Xu⁶, Xin Lou⁷, Qionghai Dai⁸

Affiliations

¹ Institute for Brain and Cognitive Sciences, BNRist, Tsinghua University, Beijing, China.
² Institute for Brain and Cognitive Sciences, BNRist, Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China.
³ Department of Radiology, Chinese PLA General Hospital, Beijing, China.
⁴ Department of Radiology, Brain Hospital of Hunan Province, Hunan, China.
⁵ Institute for Brain and Cognitive Sciences, BNRist, Tsinghua University, Beijing, China; Department of Automation, BLBCI, Tsinghua University, Beijing, China.
⁶ Institute for Brain and Cognitive Sciences, BNRist, Tsinghua University, Beijing, China; School of Software, Tsinghua University, Beijing, China. Electronic address: feng-xu@tsinghua.edu.cn.
⁷ Department of Radiology, Chinese PLA General Hospital, Beijing, China. Electronic address: louxin@301hospital.com.cn.
⁸ Institute for Brain and Cognitive Sciences, BNRist, Tsinghua University, Beijing, China; Department of Automation, BLBCI, Tsinghua University, Beijing, China. Electronic address: daiqh@tsinghua.edu.cn.

PMID: 35725824
DOI: 10.1016/S2589-7500(22)00090-5

Abstract

Background: A large training dataset with high-quality annotations is necessary for building an accurate and generalisable deep learning system, but such a dataset can be difficult and expensive to prepare in medical applications. We present a novel deep-learning-based system that requires no human annotator, only weak annotation derived from diagnosis reports, for accurate and generalisable detection of multiple head disorders from CT scans, including ischaemia, haemorrhage, tumours, and skull fractures.

Methods: Our system was developed on 104 597 head CT scans from the Chinese PLA General Hospital, with associated textual diagnosis reports. Without expert annotation, we used keyword matching on the reports to automatically generate disorder labels for each scan. The labels were inaccurate because of the unreliable annotator-free strategy and inexact because of scan-level annotation. We proposed RoLo, a novel weakly supervised learning algorithm, with a noise-tolerant mechanism and a multi-instance learning strategy to address these issues. RoLo was tested on retrospective (2357 scans from the Chinese PLA General Hospital), prospective (650 scans from the Chinese PLA General Hospital), cross-centre (1525 scans from the Brain Hospital of Hunan Province), cross-equipment (1484 scans from the Chinese PLA General Hospital), and cross-nation (CQ500 public dataset from India) test datasets. Four radiologists were tested on the prospective test dataset before and after viewing system recommendations to assess whether the system could improve diagnostic performance.
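Illustrative sketch (not the authors' code): the following minimal Python example shows how keyword matching on a report could produce scan-level labels, and why such labels are noisy. The keyword lists, the English-language report, and the function names `weak_labels` and `scan_score` are assumptions made for illustration; the abstract does not give the actual keywords or the details of RoLo. The max-pooling step is one common multi-instance learning choice, not necessarily the one used in the study.

```python
# Hypothetical keyword lists; the study's actual keywords are not stated in the abstract.
KEYWORDS = {
    "ischaemia": ("ischaemia", "ischemia", "infarct"),
    "haemorrhage": ("haemorrhage", "hemorrhage", "haematoma", "hematoma"),
    "tumour": ("tumour", "tumor", "mass lesion"),
    "fracture": ("fracture",),
}


def weak_labels(report: str) -> dict[str, int]:
    """Return one 0/1 label per disorder by substring matching on the report text."""
    text = report.lower()
    return {
        disorder: int(any(keyword in text for keyword in keywords))
        for disorder, keywords in KEYWORDS.items()
    }


def scan_score(slice_scores: list[float]) -> float:
    """Aggregate per-slice scores into one scan-level score by max pooling.

    This is a generic multi-instance learning aggregator (assumed here):
    a scan is positive if at least one slice is positive.
    """
    return max(slice_scores)


# Negation is ignored, so "no fracture seen" still yields fracture = 1.
# This is the kind of label noise a noise-tolerant training scheme must absorb.
labels = weak_labels("Acute subdural hematoma; no fracture seen.")
print(labels)
```

In this example the report is labelled positive for fracture although the radiologist ruled it out, which illustrates the "inaccurate" labels described above. The label also applies to the whole scan rather than to individual slices, which is the "inexact" part.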

Findings: The area under the receiver operating characteristic curve for detecting the four disorder types was 0·976 (95% CI 0·976-0·976) for retrospective, 0·975 (0·974-0·976) for prospective, 0·965 (0·964-0·966) for cross-centre, and 0·971 (0·971-0·972) for cross-equipment test datasets, and 0·964 (0·964-0·966) for CQ500 (with only haemorrhage and fracture). The system achieved similar performance to four radiologists and helped to improve sensitivity and specificity by 0·109 (95% CI 0·086-0·131) and 0·022 (0·017-0·026), respectively.
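Illustrative sketch (not the authors' code): the area under the receiver operating characteristic curve can be computed as the probability that a randomly chosen positive scan receives a higher score than a randomly chosen negative one. The percentile bootstrap shown here is one common way to obtain a 95% CI; the abstract does not state which method the study used, so the function names, the bootstrap, and its parameters are assumptions.

```python
import random


def auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, for illustration only)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for the AUC to be defined
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(n_boot * alpha / 2)]
    upper = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Toy data: three of the four positive-negative pairs are ranked correctly.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, y_score))  # 0.75
print(bootstrap_ci(y_true, y_score))
```

With thousands of test scans, as in this study, resampling gives narrow intervals, which is consistent with the tight CIs reported above.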

Interpretation: Without expert-annotated data, our system achieved accurate and generalisable performance for head disorder detection. The system improved the diagnostic performance of radiologists. Because of its accuracy and generalisability, our computer-aided diagnostic system could be used in clinical practice to improve the accuracy and efficiency of radiologists in different hospitals.

Funding: National Key R&D Program of China, National Natural Science Foundation of China, and Beijing Natural Science Foundation.

Publication types

Multicenter Study
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Deep Learning*
Polyesters
Prospective Studies
Retrospective Studies

Substances

Polyesters