PHDMF: A Flexible and Scalable Personal Health Data Management Framework Based on Blockchain Technology

Front Genet. 2022 Apr 13;13:877870. doi: 10.3389/fgene.2022.877870. eCollection 2022.

Abstract

Currently, most personal health data (PHD) are managed and stored separately by individual medical institutions. When these data need to be shared, they must be transferred to a trusted management center and approved by the data owners through third-party endorsement technology. It is therefore difficult for personal health data to be shared and circulated across multiple medical institutions. Moreover, directly exchanging and sharing the original data can no longer keep pace with the rapid data growth of medical institutions, because it requires massive data transfers across agencies. To securely share and manage the massive amount of personal health data generated by various medical institutions, a federated personal health data management framework (PHDMF, https://hvic.biosino.org/PHDMF) has been developed, which has the following advantages: 1) blockchain technology was used to establish a data consortium across multiple medical institutions, which provides a flexible and scalable technical solution for member extension and solves the problem of third-party endorsement during data sharing; 2) using distributed data storage technology, personal health data are mainly stored in their original medical institutions, so massive data transfers are no longer needed, which keeps pace with the rapid data growth of these institutions; 3) distributed ledger technology was used to record the hash value of data; given the tamper-resistant nature of the ledger, malicious modification of data can be identified by comparing hash values; 4) smart contract technology was introduced to manage users' access to and operations on data, which makes the data transaction process traceable and solves the problem of data provenance; and 5) a trusted computing environment was provided for meta-analysis with statistical information instead of original data; this environment could be further applied to more health data, such as genome sequencing data, protein expression data, and metabolic profile data, by combining federated learning and blockchain technology. In summary, the framework provides a convenient, secure, and trusted environment for health data supervision and circulation, which facilitates the establishment of a consortium across medical institutions and helps realize the value of data sharing and mining.
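
The hash comparison described in point 3 can be illustrated with a minimal sketch. It is not the PHDMF implementation: the `HashLedger` class, the record identifier, and the choice of SHA-256 are assumptions made for illustration, and an in-memory dictionary stands in for the distributed ledger.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


class HashLedger:
    """Hypothetical stand-in for a distributed ledger: maps record IDs to data hashes."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}

    def register(self, record_id: str, data: bytes) -> str:
        """Record the hash of a data item once; existing entries cannot be overwritten."""
        if record_id in self._entries:
            raise ValueError(f"record {record_id!r} is already registered")
        digest = sha256_hex(data)
        self._entries[record_id] = digest
        return digest

    def verify(self, record_id: str, data: bytes) -> bool:
        """Return True if the data still matches the hash recorded for this record ID."""
        return self._entries[record_id] == sha256_hex(data)


# The data stay at the originating institution; only the hash is recorded.
ledger = HashLedger()
original = b"patient-123,glucose,5.4"
ledger.register("record-001", original)

unchanged_ok = ledger.verify("record-001", original)
tampered_ok = ledger.verify("record-001", b"patient-123,glucose,9.9")
```

Here `unchanged_ok` is true and `tampered_ok` is false, because any change to the stored bytes produces a different digest from the one recorded at registration.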

Keywords: blockchain; data provenance; data sharing; personal health data; smart contract.