Enhancing Targeted Minority Class Prediction in Sentence-Level Relation Extraction

Hyeong-Ryeol Baek; Yong-Suk Choi

doi:10.3390/s22134911

Sensors (Basel). 2022 Jun 29;22(13):4911. doi: 10.3390/s22134911.

Authors

Hyeong-Ryeol Baek¹, Yong-Suk Choi²

Affiliations

¹ Department of Artificial Intelligence, Hanyang University, Seoul 04763, Korea.
² Department of Computer Science and Engineering, Hanyang University, Seoul 04763, Korea.

Abstract

Sentence-level relation extraction (RE) has a highly imbalanced data distribution: about 80% of the data are labeled as negative, i.e., no relation, and there are minority classes (MCs) among the positive labels; furthermore, some MC instances are incorrectly labeled. Because of these challenges, i.e., label noise and low source availability, most models fail to learn MCs and obtain zero or very low F1 scores on them. Previous studies, however, have focused on micro F1 scores and have not adequately addressed MCs. To reduce the high misclassification error for MCs, we introduce (1) a minority class attention module (MCAM) and (2) effective augmentation methods specialized for RE. MCAM calculates confidence scores for MC instances to select reliable ones for augmentation, and aggregates MC information while training a model. Our experiments show that our methods achieve state-of-the-art F1 scores on TACRED and dramatically improve minority class F1 scores.
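The abstract's point that micro F1 can hide failure on minority classes can be illustrated with a small sketch. The code below is not the authors' evaluation script. It assumes the common RE convention of excluding the negative class from the micro average. The label names and counts are invented toy data that only loosely mirror the roughly 80% negative share mentioned above, and `f1_report` is a hypothetical helper name.

```python
from collections import Counter

NEGATIVE = "no_relation"  # assumed name of the negative label


def f1_report(gold, pred, negative=NEGATIVE):
    """Return (micro_f1, {label: f1}) over positive labels only."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != negative:
                tp[g] += 1  # correct positive prediction
        else:
            if p != negative:
                fp[p] += 1  # predicted a relation that is wrong
            if g != negative:
                fn[g] += 1  # missed the gold relation

    def f1(t, false_pos, false_neg):
        denom = 2 * t + false_pos + false_neg
        return 2 * t / denom if denom else 0.0

    labels = sorted((set(gold) | set(pred)) - {negative})
    per_class = {lab: f1(tp[lab], fp[lab], fn[lab]) for lab in labels}
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, per_class


# Toy data (invented): 80 negatives, 18 majority-class positives,
# and 2 minority-class positives that the model never predicts.
gold = [NEGATIVE] * 80 + ["per:title"] * 18 + ["org:dissolved"] * 2
pred = [NEGATIVE] * 80 + ["per:title"] * 18 + [NEGATIVE] * 2

micro, per_class = f1_report(gold, pred)
print(f"micro F1: {micro:.3f}")
for label, score in per_class.items():
    print(f"{label}: {score:.3f}")
```

In this toy setting the micro F1 stays close to 0.95 even though the minority label is never predicted and its per-class F1 is zero, which is the kind of gap the abstract describes.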

Keywords: data augmentation; minority class; relation extraction.

MeSH terms

Attention
Language*
Learning*

Grants and funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT, Ministry of Science and ICT) (Nos. 2018R1A5A7059549, 2020R1A2C1014037), and by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01373, Artificial Intelligence Graduate School Program (Hanyang University)).