End-to-End Dialogue Generation Using a Single Encoder and a Decoder Cascade With a Multidimension Attention Mechanism

IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):8482-8492. doi: 10.1109/TNNLS.2022.3151347. Epub 2023 Oct 27.

Abstract

Human dialogues often exhibit underlying dependencies between turns, with each interlocutor influencing the queries and responses of the other. Building on this observation, this article proposes a neural architecture for conversation modeling that takes into account the dialogue history of both interlocutors. The model is generative: a single encoder feeds three decoders, which process three successive dialogue turns to predict the next utterance, while a multidimension attention mechanism aggregates past and current contexts to produce a cascade effect across the decoders. The result is a more comprehensive account of how the dialogue evolves than is obtained by focusing on a single turn, on the last encoder context, or on the user side alone. The response generation performance of the model is evaluated on three corpora of different sizes and topics and compared with six recent generative neural architectures, using both automatic metrics and human judgments. Our results show that the proposed architecture equals or improves on the state of the art in adequacy and fluency, particularly when trained on large open-domain corpora. Moreover, it tracks the evolution of the dialogue state more closely, which aids response explainability.
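To make the encoder-to-decoder-cascade structure concrete, the sketch below shows one way such a model could be wired up in PyTorch: a single shared encoder over the dialogue history, a list of decoders applied in sequence, and an attention step that aggregates encoder states together with the states of earlier decoders before conditioning each later decoder. This is a minimal illustrative sketch under assumed design choices (GRU units, a simple scaled dot-product attention standing in for the paper's multidimension attention, and arbitrary sizes); it is not the authors' implementation.

```python
# Minimal sketch of a one-encoder / three-decoder cascade.
# All module names, sizes, and the attention formulation are illustrative
# assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class CascadeDialogueModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, n_turns=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Single shared encoder over the concatenated dialogue history.
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # One decoder per turn; each is conditioned on the encoder states
        # plus the contexts produced by decoders earlier in the cascade.
        self.decoders = nn.ModuleList(
            [nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
             for _ in range(n_turns)]
        )
        self.attn = nn.Linear(hid_dim, hid_dim, bias=False)
        self.out = nn.Linear(hid_dim, vocab_size)

    def _attend(self, query, contexts):
        # Aggregate past and current contexts with scaled dot-product
        # attention over the stacked context vectors (a simple stand-in
        # for the paper's multidimension attention).
        scores = torch.einsum("bd,btd->bt", self.attn(query), contexts)
        weights = torch.softmax(scores / contexts.size(-1) ** 0.5, dim=-1)
        return torch.einsum("bt,btd->bd", weights, contexts)

    def forward(self, history_ids, target_ids):
        # history_ids: (batch, hist_len); target_ids: (batch, tgt_len)
        enc_states, _ = self.encoder(self.embed(history_ids))
        contexts = enc_states                  # start from encoder states
        dec_input = self.embed(target_ids)
        hidden = None
        for decoder in self.decoders:
            # An attention vector conditions every decoder step.
            ctx = self._attend(contexts.mean(dim=1), contexts)
            ctx = ctx.unsqueeze(1).expand(-1, dec_input.size(1), -1)
            dec_out, hidden = decoder(
                torch.cat([dec_input, ctx], dim=-1), hidden)
            # Cascade effect: later decoders also attend over the
            # context states emitted by earlier decoders.
            contexts = torch.cat([contexts, dec_out], dim=1)
        return self.out(dec_out)               # logits over the vocabulary
```

The key design point the sketch illustrates is the cascade: because each decoder appends its output states to the shared context pool, the final decoder attends over the full history from both interlocutors rather than over the last encoder context alone.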