A tutorial in assessing disclosure risk in microdata

Stat Med. 2018 Nov 10;37(25):3693-3706. doi: 10.1002/sim.7667. Epub 2018 Jun 21.

Abstract

Statistical agencies release statistical data to other agencies for research purposes or to inform public policy. Prior to data release, these agencies have a legal and ethical obligation to protect the confidentiality of individuals in the data. Agencies often release altered versions of the data, but some risk of disclosure usually remains. Many well-studied measures are available to assess this risk; however, many agencies today continue to rely on subjective judgement, past experience, and ad hoc rules or checklists. More recently, there has been a recognized demand for quantitative risk measures that provide more objective criteria for data release. This tutorial provides an overview of the statistical disclosure control framework for microdata. We focus on the risk analysis stage within this framework by defining existing disclosure risk measures and showing how to estimate them with available software.

Keywords: confidentiality; data protection; disclosure risk; key variables; population unique.
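
As an illustration of the kind of quantitative measure the abstract refers to, the sketch below counts how many records in a sample are unique on a chosen set of key variables. It is a minimal example and not the method or software described in the tutorial: the function names, the threshold parameter k, and the toy records are all hypothetical. Note that it measures uniqueness in the sample only; whether a record is also unique in the population cannot be read from the sample and would have to be estimated with a statistical model.

```python
from collections import Counter


def sample_frequencies(records, key_vars):
    """Return, for each record, how many records share its key-variable combination."""
    keys = [tuple(record[v] for v in key_vars) for record in records]
    counts = Counter(keys)
    return [counts[key] for key in keys]


def risk_summary(records, key_vars, k=3):
    """Summarise sample-level risk: records that are unique or in cells smaller than k."""
    freqs = sample_frequencies(records, key_vars)
    n = len(freqs)
    n_unique = sum(f == 1 for f in freqs)
    return {
        "sample_uniques": n_unique,
        "below_k": sum(f < k for f in freqs),
        "share_unique": n_unique / n if n else 0.0,
    }


# Toy microdata, invented purely for illustration.
records = [
    {"age_group": "30-39", "sex": "F", "region": "North"},
    {"age_group": "30-39", "sex": "F", "region": "North"},
    {"age_group": "40-49", "sex": "M", "region": "South"},
    {"age_group": "50-59", "sex": "F", "region": "North"},
]

summary = risk_summary(records, ["age_group", "sex", "region"], k=2)
print(summary)
```

With these toy records, the last two rows each have a key-variable combination that no other record shares, so they are the sample uniques. Adding more key variables or using finer categories generally increases the number of such records.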

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Algorithms
  • Confidentiality* / ethics
  • Disclosure* / ethics
  • Humans
  • Models, Statistical
  • Risk Assessment*
  • Risk Factors
  • Software
  • Statistics as Topic* / ethics