Relay learning: a physically secure framework for clinical multi-site deep learning

NPJ Digit Med. 2023 Nov 4;6(1):204. doi: 10.1038/s41746-023-00934-4.

Abstract

Big data serves as the cornerstone for constructing real-world deep learning systems across various domains. In medicine and healthcare, a single clinical site lacks sufficient data, thus necessitating the involvement of multiple sites. Unfortunately, concerns regarding data security and privacy hinder the sharing and reuse of data across sites. Existing approaches to multi-site clinical learning heavily depend on the security of the network firewall and system implementation. To address this issue, we propose Relay Learning, a secure deep-learning framework that physically isolates clinical data from external intruders while still leveraging the benefits of multi-site big data. We demonstrate the efficacy of Relay Learning in three medical tasks of different diseases and anatomical structures, including structure segmentation of retina fundus, mediastinum tumors diagnosis, and brain midline localization. We evaluate Relay Learning by comparing its performance to alternative solutions through multi-site validation and external validation. Incorporating a total of 41,038 medical images from 21 medical hosts, including 7 external hosts, with non-uniform distributions, we observe significant performance improvements with Relay Learning across all three tasks. Specifically, it achieves an average performance increase of 44.4%, 24.2%, and 36.7% for retinal fundus segmentation, mediastinum tumor diagnosis, and brain midline localization, respectively. Remarkably, Relay Learning even outperforms central learning on external test sets. In the meanwhile, Relay Learning keeps data sovereignty locally without cross-site network connections. We anticipate that Relay Learning will revolutionize clinical multi-site collaboration and reshape the landscape of healthcare in the future.