Dependency-based syntax-aware word representations

Authors:

Abstract

Dependency syntax has proven highly useful for a number of natural language processing (NLP) tasks. Typical approaches to utilizing dependency syntax include Tree-RNN and Tree-Linearization, both of which take the explicit 1-best tree output of a well-trained parser as input. However, these approaches may suffer from error propagation, because 1-best tree outputs inevitably contain errors. In this work, we propose a novel approach that integrates dependency syntax without using the discrete tree outputs. The key idea is to use the intermediate hidden representations of a well-trained encoder-decoder dependency parser, which we refer to as Dependency-based Syntax-Aware Word Representations (Dep-SAWRs). We then simply concatenate these Dep-SAWRs with the conventional context-insensitive word embeddings to compose the input word representations, without modifying the model architecture of the downstream task. We evaluate the proposed method on four typical kinds of NLP tasks: sentence classification, sentence matching, sequence labeling and machine translation. Experimental results show that the proposed approach is highly promising. On the one hand, it utilizes dependency syntax effectively, performing consistently better on all four tasks than baselines that do not use syntax. On the other hand, it outperforms the Tree-RNN and Tree-Linearization approaches in most settings while integrating syntax highly efficiently. In addition, the proposed method can be easily extended to encode other structural attributes of language.
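The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `parser_hidden_states` is a hypothetical stand-in for the encoder of a pretrained dependency parser, and the table builder, the dimensions and the toy sentence are likewise assumptions chosen only for illustration.

```python
import random


def make_table(vocab, dim, seed):
    """Build a toy lookup table mapping each token to a vector of length `dim`."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for tok in vocab}


def parser_hidden_states(tokens, dim):
    """Hypothetical stand-in for a pretrained parser encoder (an assumption).

    A real system would run the parser's encoder over the sentence and return
    its hidden states. Here each vector is derived from the position and the
    whole sentence, so it is context-dependent but carries no real syntax.
    """
    sentence = " ".join(tokens)
    states = []
    for i in range(len(tokens)):
        rng = random.Random(f"{i}:{sentence}")
        states.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return states


def compose_inputs(tokens, embeddings, hidden_states):
    """Concatenate each word embedding with the parser state at the same position."""
    if len(tokens) != len(hidden_states):
        raise ValueError("need exactly one parser state per token")
    return [embeddings[tok] + state for tok, state in zip(tokens, hidden_states)]


EMB_DIM, SAWR_DIM = 4, 3  # illustrative sizes, not taken from the paper
tokens = ["the", "parser", "reads", "text"]
embeddings = make_table(sorted(set(tokens)), EMB_DIM, seed=0)
sawrs = parser_hidden_states(tokens, SAWR_DIM)
inputs = compose_inputs(tokens, embeddings, sawrs)

# One vector per token, each of length EMB_DIM + SAWR_DIM.
print(len(inputs), len(inputs[0]))
```

Because the composed vectors only widen the input layer, a downstream model can consume them without any change beyond its input dimension, which matches the abstract's claim that no architectural modification is required.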

Keywords: Dependency syntax, Sentence classification, Sentence matching, Sequence labeling, Neural machine translation, Syntax integration

Review history: Received 21 August 2019, Revised 27 October 2020, Accepted 9 November 2020, Available online 16 November 2020, Version of Record 27 November 2020.

Official paper URL: https://doi.org/10.1016/j.artint.2020.103427