Publicly shared repositories play an important role in advancing performance benchmarks for some of the most important tasks in natural language processing (NLP) and healthcare in general. This study reviews most recent benchmarks based on the 2014 n2c2 de-identification dataset. Pre-processing challenges were uncovered, and attention brought to the discrepancies in reported number of Protected Health Information (PHI) entities among the studies. Improved reporting is required for greater transparency and reproducibility.
Keywords: NLP; Natural language processing; de-identification; i2b2.