Co-training for Implicit Discourse Relation Recognition Based on Manual and Distributed Features

Authors: Changxing Wu, Xiaodong Shi, Jinsong Su, Yidong Chen, Yanzhou Huang

Abstract

Implicit discourse relation recognition aims to identify the semantic relation between two sentences when no discourse connective is present. Because labeled data are scarce, previous work has tried to generate additional training data automatically by removing the discourse connectives from explicit discourse relation instances. However, using these artificial data indiscriminately has been shown to degrade the performance of implicit discourse relation recognition. To address this problem, we propose a co-training approach based on manual features and distributed features, which selects useful instances from the artificial data to enlarge the labeled data. In addition, the distributed features are learned with recursive autoencoder based approaches, which can capture, to some extent, the semantics of sentences, and this information is valuable for implicit discourse relation recognition. Experimental results on both the PDTB and CDTB data sets indicate that: (1) the learned distributed features are complementary to the manual features, and are thus suitable for co-training; (2) our proposed co-training approach uses the artificial data effectively and significantly outperforms the baselines.

Keywords: Co-training, Artificial implicit data, Distributed features, Implicit discourse relation recognition

Official paper URL: https://doi.org/10.1007/s11063-017-9582-x