Survey: Leakage and Privacy at Inference Time

Marija Jegorova; Chaitanya Kaul; Charlie Mayor; Alison Q O'Neil; Alexander Weir; Roderick Murray-Smith; Sotirios A Tsaftaris

doi:10.1109/TPAMI.2022.3229593

IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):9090-9108. doi: 10.1109/TPAMI.2022.3229593. Epub 2023 Jun 5.

PMID: 37015684

Abstract

Leakage of data from publicly available Machine Learning (ML) models is an area of growing significance, since commercial and government applications of ML can draw on multiple sources of data, potentially including users' and clients' sensitive data. We provide a comprehensive survey of contemporary advances on several fronts, covering involuntary data leakage, which is natural to ML models; potential malicious leakage, which is caused by privacy attacks; and currently available defence mechanisms. We focus on inference-time leakage, as it is the most likely scenario for publicly available models. We first discuss what leakage is in the context of different data, tasks, and model architectures. We then propose a taxonomy across involuntary and malicious leakage, followed by a description of currently available defences, assessment metrics, and applications. We conclude with outstanding challenges and open questions, outlining some promising directions for future research.