Utility-Privacy Trade-Off in Distributed Machine Learning Systems

Entropy (Basel). 2022 Sep 14;24(9):1299. doi: 10.3390/e24091299.

Abstract

In distributed machine learning (DML), clients' data are not directly transmitted to the server for model training; nevertheless, attackers can obtain sensitive information about clients by analyzing the local gradient parameters that clients upload. To address this threat, we use the differential privacy (DP) mechanism to protect the clients' local parameters. In this paper, we study the utility-privacy trade-off in DML with the DP mechanism from an information-theoretic point of view. Specifically, three cases are considered: independent clients' local parameters with independent DP noise, dependent local parameters with independent DP noise, and dependent local parameters with dependent DP noise. Utility and privacy are characterized by mutual information and conditional mutual information, respectively. First, we establish the relationship between utility and privacy for the three cases. Then, we derive the optimal noise variance that achieves the maximal utility under a given level of privacy. Finally, the theoretical results are illustrated by numerical examples.
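As a minimal sketch (not from the paper itself), the Gaussian DP mechanism described above can be illustrated as follows: each client clips its local gradient to a fixed norm and adds zero-mean Gaussian noise before uploading it to the server. The function and parameter names here are illustrative assumptions, not the authors' notation.

```python
import numpy as np

def gaussian_mechanism(gradient, clip_norm, sigma, rng):
    """Clip a client's local gradient to `clip_norm` and add
    zero-mean Gaussian DP noise with standard deviation
    sigma * clip_norm before it is uploaded to the server."""
    norm = np.linalg.norm(gradient)
    # Scale the gradient down only if its norm exceeds the clipping bound.
    clipped = gradient * min(1.0, clip_norm / max(norm, 1e-12))
    # Gaussian noise calibrated to the clipping bound (sensitivity).
    noise = rng.normal(0.0, sigma * clip_norm, size=gradient.shape)
    return clipped + noise

rng = np.random.default_rng(0)
grad = np.array([3.0, 4.0])  # ||grad|| = 5
noisy_grad = gaussian_mechanism(grad, clip_norm=1.0, sigma=0.5, rng=rng)
```

Larger `sigma` strengthens privacy (lower conditional mutual information between the upload and the client's data) but degrades utility, which is the trade-off the paper quantifies.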

Keywords: Gaussian noise; differential privacy; distributed machine learning; mutual information; trade-off.

Grants and funding

This paper was supported in part by the National Key R&D Program of China under Grant 2020YFB1806405; in part by the National Natural Science Foundation of China under Grants 62071392 and U21A20454; in part by the Natural Science Foundation of Sichuan under Grant 2022NSFSC0484; in part by the central government funds for guiding local scientific and technological development under Grant 2021ZYD0001; in part by the 111 Project under Grant No. 111-2-14; in part by the NSFC-STINT project under Grant 62011530134; and in part by the Major Key Project of PCL under Grant PCL2021A04.