Subsentence Extraction from Text Using Coverage-Based Deep Learning Language Models

JongYoon Lim; Inkyu Sa; Ho Seok Ahn; Norina Gasteiger; Sanghyub John Lee; Bruce MacDonald

doi:10.3390/s21082712

Sensors (Basel). 2021 Apr 12;21(8):2712. doi: 10.3390/s21082712.

Authors

JongYoon Lim¹, Inkyu Sa², Ho Seok Ahn¹, Norina Gasteiger¹, Sanghyub John Lee¹, Bruce MacDonald¹

Affiliations

¹ CARES, Department of Electrical, Computer and Software Engineering, University of Auckland, Auckland 1142, New Zealand.
² CSIRO Data61, Robotics and Autonomous Systems Group, Perception Group, Pullenvale 4069, Australia.

Abstract

Sentiment prediction remains a challenging and unresolved task in various research fields, including psychology, neuroscience, and computer science. The difficulty stems from the high subjectivity of sentiment and from the limited input sources that can effectively capture the actual sentiment. The task is harder still when text is the only input. Meanwhile, the rise of deep learning and an unprecedentedly large volume of data have paved the way for artificial intelligence to make impressively accurate predictions and even perform human-level reasoning. Drawing inspiration from this, we propose a coverage-based sentiment and subsentence extraction system that estimates a span of the input text and recursively feeds this information back to the networks. The predicted subsentence consists of auxiliary information expressing a sentiment. This is an important building block for enabling vivid and epic sentiment delivery (within the scope of this paper) and for other natural language processing tasks such as text summarisation and Q&A. Our approach outperforms state-of-the-art approaches by a large margin in subsentence prediction, raising the average Jaccard score from 0.72 to 0.89. For the evaluation, we designed rigorous experiments consisting of 24 ablation studies. Finally, we return the lessons learned to the community by sharing software packages and a public dataset that can reproduce the results presented in this paper.
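
As an illustration of the evaluation metric named above, the following is a minimal sketch of an average Jaccard score between predicted and reference subsentences. It assumes a word-level comparison on lowercased, whitespace-separated tokens, which is a common convention for span extraction but is not stated in this abstract; the function names `jaccard` and `average_jaccard` are hypothetical and are not taken from the authors' software packages.

```python
from typing import Iterable, Tuple


def jaccard(predicted: str, reference: str) -> float:
    """Word-level Jaccard similarity between two strings.

    Assumption: tokens are lowercased and split on whitespace.
    """
    a = set(predicted.lower().split())
    b = set(reference.lower().split())
    if not a and not b:
        # Two empty spans are treated as a perfect match.
        return 1.0
    return len(a & b) / len(a | b)


def average_jaccard(pairs: Iterable[Tuple[str, str]]) -> float:
    """Mean Jaccard score over (predicted, reference) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (predicted, reference) pair is required")
    return sum(jaccard(p, r) for p, r in pairs) / len(pairs)
```

Under these assumptions, a prediction of "so happy" against a reference of "so happy today" shares two of three distinct words and scores 2/3.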

Keywords: bidirectional transformer; human robot interaction; natural language processing; sentiment analysis; span prediction; text extraction.

MeSH terms

Artificial Intelligence*
Deep Learning*
Humans
Language
Natural Language Processing
Research Design

Grants and funding

10077553/The Ministry of Trade, Industry & Energy (MI, Korea)