FedDroidMeter: A Privacy Risk Evaluator for FL-Based Android Malware Classification Systems

Entropy (Basel). 2023 Jul 12;25(7):1053. doi: 10.3390/e25071053.

Abstract

In traditional centralized Android malware classifiers based on machine learning, the training samples uploaded by users contain sensitive personal information, such as app usage and device security status, which compromises personal privacy if used directly by the server. Federated learning (FL)-based Android malware classifiers have attracted much attention because they preserve privacy while supporting multi-party joint modeling. However, research shows that indirect privacy inference by curious central servers still threatens this framework. We propose FedDroidMeter, a privacy risk evaluation framework based on normalized mutual information that measures the privacy risk of FL-based malware classifiers with respect to users' privacy requirements. It captures the essential cause of sensitive-information disclosure in the classifier, independently of the attack model and the attacker's capability. We performed numerical assessments using the AndroZoo dataset, baseline FL-based classifiers, a privacy-inference attack model, and baseline privacy-evaluation methods. The experimental results show that FedDroidMeter measures the privacy risk of the classifiers more effectively. Moreover, by comparing different model architectures, FL configurations, and privacy parameter settings, we show that FedDroidMeter can compare privacy risk fairly across different use cases. Finally, we present a preliminary study of how privacy risk behaves in these classifiers. The experimental results underscore the importance of a systematic privacy risk evaluation framework for FL-based malware classifiers and provide practical experience and a theoretical basis for studying targeted defense methods.

Keywords: federated learning; malware classification; privacy risk; sensitive information.
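As a rough illustration of the kind of measurement the abstract describes, the sketch below estimates normalized mutual information (NMI) between a hypothetical sensitive feature and an observable shared by a client (e.g., an update statistic). This is a minimal sketch under assumed names and a simple histogram-based estimator, not the paper's actual FedDroidMeter procedure.

```python
import numpy as np

def normalized_mutual_information(x, y, bins=16):
    """Estimate NMI between two 1-D samples via histogram binning.

    NMI = I(X; Y) / sqrt(H(X) * H(Y)), in [0, 1]: 0 suggests the observable
    leaks nothing about the sensitive feature, 1 suggests full leakage.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()        # empirical joint distribution P(X, Y)
    px = pxy.sum(axis=1)             # marginal P(X)
    py = pxy.sum(axis=0)             # marginal P(Y)

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    hx, hy = entropy(px), entropy(py)
    hxy = entropy(pxy.ravel())
    mi = hx + hy - hxy               # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return mi / np.sqrt(hx * hy) if hx > 0 and hy > 0 else 0.0

# Toy usage (all quantities are hypothetical): how much does a shared
# update statistic reveal about a sensitive per-sample feature?
rng = np.random.default_rng(0)
sensitive_feature = rng.normal(size=2000)
shared_update_stat = sensitive_feature + rng.normal(scale=0.5, size=2000)
print("estimated privacy-risk score (NMI):",
      round(normalized_mutual_information(sensitive_feature, shared_update_stat), 3))
```

In this toy setup, a higher NMI score would indicate that the shared quantity carries more information about the sensitive feature; the actual framework's risk measure and estimation details are defined in the paper itself.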