Federating patients identities: the case of rare diseases

Meriem Maaroufi; Paul Landais; Claude Messiaen; Marie-Christine Jaulent; Rémy Choquet

doi:10.1186/s13023-018-0948-6

Orphanet J Rare Dis. 2018 Nov 12;13(1):199. doi: 10.1186/s13023-018-0948-6.

Authors

Meriem Maaroufi¹ ² ³ ⁴, Paul Landais⁵ ⁶, Claude Messiaen¹, Marie-Christine Jaulent² ³ ⁴, Rémy Choquet¹ ² ³ ⁴

Affiliations

¹ Banque Nationale de Données Maladies Rares, Hôpital Necker Enfants Malades, Assistance Publique des Hôpitaux de Paris, Paris, France.
² INSERM, U1142, and UMR_S 1142, LIMICS, Sorbonne University, Paris, France.
³ Pierre and Marie Curie University, Paris, France.
⁴ Paris 13 University, F-93430, Villetaneuse, France.
⁵ UPRES EA2415, Clinical Research University Institute, Montpellier University, 641 avenue du Doyen Gaston Giraud, 34093, Montpellier, France. paul.landais@umontpellier.fr.
⁶ INSERM UMRS 933, Rare Disease Cohorts (RaDiCo), Sorbonne University, and Hôpital Trousseau, Assistance Publique Hôpitaux de Paris, Paris, France. paul.landais@umontpellier.fr.

Abstract

Background: Patient information in rare disease registries is generally collected from numerous data sources, so the data must be federated. In addition, data used for research purposes must be de-identified. Transforming nominative data into de-identified data while minimizing the number of identity duplicates is thus a key issue. We propose a method that enables patient identity federation and rare disease data de-identification while preserving the pertinence of the provided data.

Results: We developed a rare disease patient identifier, the IdMR. The IdMR generation process is a three-phase algorithm that uses a hash function to irreversibly de-identify nominative patient data, including those of foetuses. This process minimizes collision risks and reduces input variability for the purpose of identity federation. The IdMR was generated for 360,000 patients of the CEMARA database. It allowed identity federation of 1771 duplicated files. No collisions were introduced.
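The general idea of a hash-based identifier used for identity federation can be illustrated with a minimal sketch. This is not the IdMR algorithm itself: the abstract does not specify the input fields, the normalization rules, or the hash function, so the fields (last name, first name, birth date, sex), the accent-stripping normalization, the choice of SHA-256, and the function names `normalize`, `pseudonym`, and `federate` are all assumptions made for illustration.

```python
import hashlib
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Reduce spelling variability: strip accents and non-letters, then uppercase.

    Hypothetical normalization rule, chosen only for illustration.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining accents are not alphabetic, so this filter drops them
    # together with spaces, hyphens, and punctuation.
    return "".join(c for c in decomposed if c.isalpha()).upper()


def pseudonym(last_name: str, first_name: str, birth_date: str, sex: str) -> str:
    """Derive a one-way identifier from nominative data (assumed fields).

    birth_date is assumed to be an ISO 8601 string such as "2010-03-14".
    SHA-256 is an assumption; the source only says "a hash function".
    """
    fields = [normalize(last_name), normalize(first_name), birth_date, sex.strip().upper()]
    payload = "|".join(fields).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def federate(records: list[dict]) -> dict[str, list[str]]:
    """Group file ids by identifier; groups with several ids are candidate duplicates."""
    groups: dict[str, list[str]] = defaultdict(list)
    for rec in records:
        key = pseudonym(rec["last_name"], rec["first_name"], rec["birth_date"], rec["sex"])
        groups[key].append(rec["file_id"])
    return dict(groups)


if __name__ == "__main__":
    # Fictitious records: two spellings of the same person, plus a different person.
    records = [
        {"file_id": "A1", "last_name": "Lefèvre", "first_name": "Zoé", "birth_date": "2010-03-14", "sex": "F"},
        {"file_id": "B7", "last_name": "LEFEVRE", "first_name": "zoe", "birth_date": "2010-03-14", "sex": "f"},
        {"file_id": "C3", "last_name": "Martin", "first_name": "Luc", "birth_date": "1998-11-02", "sex": "M"},
    ]
    for key, files in federate(records).items():
        print(key[:12], files)
```

In this sketch, normalization is what lets two differently spelled entries for the same person map to the same identifier, while the hash keeps the nominative values out of the federated data. Two distinct patients who share all input fields would still receive the same identifier, which is the kind of collision the choice of input data has to keep rare. A plain unsalted hash over low-entropy inputs can also be attacked by enumeration, so a real deployment would need additional safeguards.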

Conclusion: We examined and discussed the risks of collisions and the creation of duplicates as well as the risks of patient re-identification. We discussed our choice of nominative input information in light of that used by other patient identification solutions. The IdMR is a patient identifier that enables identity federation and file linkage. The simplicity of the algorithm and the universality and stability of the input data make it a good candidate for European cross-border rare disease projects.

Keywords: Health information exchange; Identity federation; Patient data privacy; Patient identification systems; Rare diseases.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Databases, Factual
Humans
Rare Diseases*