Effects of data and entity ablation on multitask learning models for biomedical entity recognition

J Biomed Inform. 2022 Jun;130:104062. doi: 10.1016/j.jbi.2022.104062. Epub 2022 Apr 9.

Abstract

Motivation: Training domain-specific named entity recognition (NER) models requires high-quality, hand-curated gold standard datasets, which are time-consuming and expensive to create. Furthermore, the storage and memory required to deploy NLP models can be prohibitive when the number of tasks is large. In this work, we explore multi-task learning as a way to reduce the amount of training data needed to train new domain-specific models. We evaluate our system across 22 distinct biomedical NER datasets and assess the extent to which transfer learning helps task performance using two forms of ablation.
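
To make the setup concrete, the sketch below shows one common way such a multitask NER model can be structured: a shared BERT encoder with a separate token-classification head per dataset. This is an illustrative assumption, not the authors' implementation; the encoder name, task names, and BIO label counts are placeholders.

    import torch.nn as nn
    from transformers import AutoModel

    class MultitaskNER(nn.Module):
        def __init__(self, encoder_name, num_labels_per_task):
            super().__init__()
            # Shared encoder: every task's batches update these weights.
            self.encoder = AutoModel.from_pretrained(encoder_name)
            hidden = self.encoder.config.hidden_size
            # One lightweight classification head per NER dataset.
            self.heads = nn.ModuleDict({
                task: nn.Linear(hidden, n_labels)
                for task, n_labels in num_labels_per_task.items()
            })

        def forward(self, task, input_ids, attention_mask):
            # Encode once, then route through the task-specific head.
            states = self.encoder(input_ids=input_ids,
                                  attention_mask=attention_mask).last_hidden_state
            return self.heads[task](states)  # per-token label logits

    # Hypothetical example: two biomedical NER tasks with BIO tag sets.
    model = MultitaskNER("bert-base-cased", {"bc5cdr": 5, "ncbi_disease": 3})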

Results: We found that multitasking models generally do not improve performance, but in many cases perform on par with single-task models. However, we show that in some cases a model for a new, unseen task can be trained with less data by initializing it with weights from a multitask model, improving performance.
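
A minimal sketch of this warm-start procedure, assuming a PyTorch/BERT setup like the example above; the checkpoint path, label count, and key prefix are hypothetical, not taken from the authors' code.

    import torch
    import torch.nn as nn
    from transformers import AutoModel

    class SingleTaskNER(nn.Module):
        def __init__(self, encoder_name, num_labels):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(encoder_name)
            self.head = nn.Linear(self.encoder.config.hidden_size, num_labels)

        def forward(self, input_ids, attention_mask):
            states = self.encoder(input_ids=input_ids,
                                  attention_mask=attention_mask).last_hidden_state
            return self.head(states)

    model = SingleTaskNER("bert-base-cased", num_labels=7)

    # Overwrite the freshly initialized encoder with weights learned jointly
    # across the other biomedical NER tasks; the new task head stays random.
    multitask_state = torch.load("multitask_checkpoint.pt")  # hypothetical path
    encoder_state = {k.removeprefix("encoder."): v
                     for k, v in multitask_state.items()
                     if k.startswith("encoder.")}
    model.encoder.load_state_dict(encoder_state)
    # Fine-tune on the (smaller) new-task training set as usual from here.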

Availability: The software underlying this article is available at: https://github.com/NLPatVCU/multitasking_bert-1.

Keywords: Biomedical text processing; Deep learning; Machine learning; Named entity recognition; Natural language processing.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Natural Language Processing*
  • Software*