Enhancing Text Generation via Parse Tree Embedding

Comput Intell Neurosci. 2022 Jun 10;2022:4096383. doi: 10.1155/2022/4096383. eCollection 2022.

Abstract

Natural language generation (NLG) is a core component of machine translation, dialogue systems, speech recognition, summarization, and so forth. Existing text generation methods are typically based on recurrent neural language models (NLMs), which generate sentences from an encoding vector. However, most of these models lack an explicit structured representation for text generation. In this work, we introduce a new generative model for NLG, called Tree-VAE. It first samples a sentence from the training corpus and then generates a new sentence conditioned on the embedding vector of that sentence's parse tree. A Tree-LSTM is used together with the Stanford Parser to extract sentence structure information, which is then used to train a conditional discretization autoencoder generator based on the embeddings of sentence patterns. The proposed model is extensively evaluated on three different datasets. The experimental results show that the proposed model can generate substantially more diverse and coherent text than existing baseline methods.
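
To make the parse tree embedding step concrete, the following is a minimal sketch of how a Tree-LSTM can fold a parse tree into a single vector. It is not the authors' implementation. It assumes the Child-Sum Tree-LSTM variant (the abstract does not say which variant is used), a hand-written toy tree in place of Stanford Parser output, random untrained weights, and a hypothetical embedding size `DIM`. All function names are illustrative.

```python
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for illustration


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def add(*vecs):
    return [sum(vals) for vals in zip(*vecs)]


def mul(a, b):
    return [x * y for x, y in zip(a, b)]


def init_params(seed=0):
    """Random (W, U, b) per gate: i=input, f=forget, o=output, u=update."""
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]

    return {g: (mat(), mat(), [0.0] * DIM) for g in "ifou"}


def label_vector(label):
    """Stand-in for a learned embedding: a fixed pseudo-random vector per label."""
    rng = random.Random(label)
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


def gate(params, name, x, h, act):
    W, U, b = params[name]
    return [act(z) for z in add(matvec(W, x), matvec(U, h), b)]


def tree_lstm(node, params):
    """node is (label, [children]); returns (h, c) for the subtree rooted there."""
    label, children = node
    x = label_vector(label)
    states = [tree_lstm(child, params) for child in children]
    # Child-Sum: the input, output and update gates see the sum of child hidden states.
    h_sum = add([0.0] * DIM, *[h for h, _ in states])
    i = gate(params, "i", x, h_sum, sigmoid)
    o = gate(params, "o", x, h_sum, sigmoid)
    u = gate(params, "u", x, h_sum, math.tanh)
    c = mul(i, u)
    # Each child gets its own forget gate, computed from that child's hidden state.
    for h_k, c_k in states:
        f_k = gate(params, "f", x, h_k, sigmoid)
        c = add(c, mul(f_k, c_k))
    h = mul(o, [math.tanh(v) for v in c])
    return h, c


# Toy constituency tree for "cats sleep" (hand-written, not parser output).
tree = ("S", [("NP", [("cats", [])]), ("VP", [("sleep", [])])])
params = init_params()
embedding, _ = tree_lstm(tree, params)
print(embedding)
```

In a model of the kind the abstract describes, the root vector `embedding` would serve as the conditioning input to the generator, and the weights would be learned jointly with it rather than drawn at random.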

MeSH terms

  • Language*
  • Natural Language Processing*
  • Software
  • Translations