Pseudonymization of PHI Items in German Clinical Reports

Stud Health Technol Inform. 2021 May 27:281:273-277. doi: 10.3233/SHTI210163.

Abstract

We describe the adaptation of a non-clinical pseudonymization system, originally developed for a German email corpus, for clinical use. This tool replaces previously identified Protected Health Information (PHI) items, the carriers of privacy-sensitive information (original names of people, organizations, places, etc.), with fictitious surrogates that conform to the same semantic type. We evaluate the generated substitutes for grammatical correctness and for semantic and medical plausibility, and find very low error rates (less than 1%) on all of these dimensions.
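
The replacement step described above can be illustrated with a minimal sketch. It is not the authors' system: the surrogate pools, the type labels (PERSON, CITY, ORG), the span format, and the function name `pseudonymize` are all assumptions made for illustration. It only shows the general idea of swapping each already-annotated PHI span for a fictitious surrogate of the same semantic type, while keeping repeated mentions consistent. It does not attempt the grammatical adaptation (for example, German case inflection) that the evaluation in the abstract covers.

```python
import random

# Hypothetical surrogate pools, keyed by an assumed PHI type label.
# The resources used by the actual system are not described in this abstract.
SURROGATES = {
    "PERSON": ["Anna Beispiel", "Max Mustermann", "Erika Musterfrau"],
    "CITY": ["Musterstadt", "Beispielhausen"],
    "ORG": ["Klinikum Musterstadt", "Praxis Beispiel"],
}


def pseudonymize(text, spans, seed=0):
    """Replace annotated PHI spans with fictitious surrogates of the same type.

    text  -- the report text
    spans -- non-overlapping (start, end, phi_type) character offsets,
             assumed to come from an earlier PHI recognition step
    seed  -- seed for the random choice, so runs are reproducible
    Returns the rewritten text and the original-to-surrogate mapping.
    """
    rng = random.Random(seed)
    mapping = {}  # (phi_type, original) -> surrogate
    pieces = []
    cursor = 0
    for start, end, phi_type in sorted(spans):
        original = text[start:end]
        key = (phi_type, original)
        if key not in mapping:
            # Never pick a surrogate identical to the original string.
            candidates = [s for s in SURROGATES[phi_type] if s != original]
            if not candidates:
                raise ValueError(f"no surrogate available for type {phi_type}")
            mapping[key] = rng.choice(candidates)
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(mapping[key])        # type-conformant replacement
        cursor = end
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces), mapping
```

Reusing one surrogate per (type, original) pair is a design choice of this sketch: it keeps a patient or place recognizable as the same entity throughout a report without revealing the original name.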

Keywords: German-language clinical reports; Protected Health Information (PHI); pseudonymization of clinical reports; surrogate generation.

MeSH terms

  • Confidentiality*
  • Humans
  • Privacy*