Monitoring named entity recognition: the League Table

J Biomed Semantics. 2013 Sep 13;4(1):19. doi: 10.1186/2041-1480-4-19.

Abstract

Background: Named entity recognition (NER) is an essential step in automatic text processing pipelines. A number of solutions have been presented and evaluated against gold standard corpora (GSCs). Benchmarking against GSCs is crucial, but it is left to the individual researcher. Here we present the League Table, a web site that benchmarks NER solutions against selected public GSCs, maintains a ranked list, and archives the annotated corpora for future comparisons.

Results: The web site provides access to the different GSCs in a standardized format (IeXML). To submit a result, the user describes the specification of the solution used and then uploads the annotated corpus for evaluation. The performance of the system is measured against one or more GSCs, and the results are then added to the web site (the "League Table"). The site currently displays the results of publicly available NER solutions from the Whatizit infrastructure for future comparisons.
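
The evaluation and ranking steps described above can be illustrated with a minimal sketch. The abstract does not state which matching criterion or metric the League Table uses, so this example assumes exact-match comparison of annotation spans and ranking by F1; the names `Annotation`, `score`, and `rank` are hypothetical and are not part of the web site or of IeXML.

```python
from __future__ import annotations

from typing import NamedTuple


class Annotation(NamedTuple):
    """One entity mention (hypothetical structure, not the IeXML schema)."""
    doc_id: str
    start: int
    end: int
    entity_type: str


def score(gold: set[Annotation], predicted: set[Annotation]) -> dict[str, float]:
    """Compare predicted annotations with a gold standard.

    Assumption: a prediction counts as correct only if document, span
    boundaries, and entity type all match a gold annotation exactly.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def rank(results: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Order solutions by F1, best first (assumed ranking criterion)."""
    pairs = [(name, scores["f1"]) for name, scores in results.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Toy data, invented purely for illustration.
gold = {
    Annotation("d1", 0, 5, "protein"),
    Annotation("d1", 10, 15, "disease"),
}
solution_a = {
    Annotation("d1", 0, 5, "protein"),
    Annotation("d1", 20, 25, "protein"),
}
solution_b = set(gold)

table = rank({
    "solution_a": score(gold, solution_a),
    "solution_b": score(gold, solution_b),
})
```

In this sketch each submission is scored independently against the same gold standard, so an archived annotated corpus could be re-scored later if the assumed matching criterion were changed.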

Conclusion: The League Table enables the evaluation of NER solutions in a standardized infrastructure and monitors the results over the long term. It can be accessed at http://wwwdev.ebi.ac.uk/Rebholz-srv/calbc/assessmentGSC/.

Contact: rebholz@ifi.uzh.ch.