Relay learning: a physically secure framework for clinical multi-site deep learning

Zi-Hao Bo; Yuchen Guo; Jinhao Lyu; Hengrui Liang; Jianxing He; Shijie Deng; Feng Xu; Xin Lou; Qionghai Dai

doi:10.1038/s41746-023-00934-4

NPJ Digit Med. 2023 Nov 4;6(1):204. doi: 10.1038/s41746-023-00934-4.

Authors

Zi-Hao Bo¹,², Yuchen Guo³, Jinhao Lyu⁴, Hengrui Liang⁵, Jianxing He⁵, Shijie Deng⁶, Feng Xu⁷,⁸, Xin Lou⁹, Qionghai Dai¹⁰,¹¹

Affiliations

¹ School of Software, Tsinghua University, Beijing, China.
² BNRist, Tsinghua University, Beijing, China.
³ BNRist, Tsinghua University, Beijing, China. Yuchen.w.guo@gmail.com.
⁴ Department of Radiology, Chinese PLA General Hospital / Chinese PLA Medical School, Beijing, China.
⁵ Department of Thoracic Oncology and Surgery, China State Key Laboratory of Respiratory Disease & National Clinical Research Center for Respiratory Disease, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
⁶ Department of Radiology, The 921st Hospital of Chinese PLA, Changsha, China.
⁷ School of Software, Tsinghua University, Beijing, China. feng-xu@tsinghua.edu.cn.
⁸ BNRist, Tsinghua University, Beijing, China. feng-xu@tsinghua.edu.cn.
⁹ Department of Radiology, Chinese PLA General Hospital / Chinese PLA Medical School, Beijing, China. louxin@301hospital.com.cn.
¹⁰ BNRist, Tsinghua University, Beijing, China. qhdai@tsinghua.edu.cn.
¹¹ Department of Automation, Tsinghua University, Beijing, China. qhdai@tsinghua.edu.cn.

Abstract

Big data serves as the cornerstone for constructing real-world deep learning systems across various domains. In medicine and healthcare, a single clinical site lacks sufficient data, thus necessitating the involvement of multiple sites. Unfortunately, concerns regarding data security and privacy hinder the sharing and reuse of data across sites. Existing approaches to multi-site clinical learning depend heavily on the security of the network firewall and system implementation. To address this issue, we propose Relay Learning, a secure deep-learning framework that physically isolates clinical data from external intruders while still leveraging the benefits of multi-site big data. We demonstrate the efficacy of Relay Learning in three medical tasks covering different diseases and anatomical structures: retinal fundus structure segmentation, mediastinum tumor diagnosis, and brain midline localization. We evaluate Relay Learning by comparing its performance to alternative solutions through multi-site validation and external validation. Incorporating a total of 41,038 medical images from 21 medical hosts, including 7 external hosts, with non-uniform distributions, we observe significant performance improvements with Relay Learning across all three tasks. Specifically, it achieves an average performance increase of 44.4%, 24.2%, and 36.7% for retinal fundus segmentation, mediastinum tumor diagnosis, and brain midline localization, respectively. Remarkably, Relay Learning even outperforms central learning on external test sets. Meanwhile, Relay Learning keeps data sovereignty local, without cross-site network connections. We anticipate that Relay Learning will revolutionize clinical multi-site collaboration and reshape the landscape of healthcare in the future.

Grants and funding