Near-synonym substitution using a discriminative vector space model

Authors:

Highlights:

Abstract

Near-synonyms are fundamental and useful knowledge resources for computer-assisted language learning (CALL) applications. For example, in online language learning systems, learners may need to express similar meanings using different words. However, it is usually difficult to choose a suitable near-synonym to fit a given context, because the differences between near-synonyms are not easily grasped in practical use, especially for second language (L2) learners. Accordingly, it is worth developing algorithms that verify whether near-synonyms match given contexts. Such algorithms could be used in applications that help L2 learners discover the collocational differences between near-synonyms. We propose a discriminative vector space model (DT-VSM) for the near-synonym substitution task, treating it as a classification task. The model has two components: a vector space model and discriminative training. The vector space model serves as a baseline classifier that assigns each test example to one of the near-synonyms in a given near-synonym set. A discriminative training technique is then employed to improve the vector space model by distinguishing positive and negative features for each near-synonym. Experimental results show that the DT-VSM achieves higher accuracy than both the pointwise mutual information and n-gram-based methods used in previous studies.
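As a rough illustration only (not the authors' exact formulation), the sketch below shows the general two-component idea described in the abstract: a vector space model that classifies a context by cosine similarity against per-synonym feature vectors, followed by a simple perceptron-style discriminative pass that boosts positive features for the gold synonym and penalizes features of the confused rival. All class names, feature choices, and update rules here are illustrative assumptions.

```python
from collections import Counter, defaultdict
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors (feature -> weight dicts)."""
    dot = sum(u[f] * v.get(f, 0.0) for f in u)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


class NearSynonymVSM:
    """Minimal sketch: one context-feature vector per near-synonym,
    classification by cosine similarity, plus a discriminative update
    step on misclassified training examples (an assumption, not the
    paper's exact training procedure)."""

    def __init__(self, synonyms):
        self.synonyms = list(synonyms)
        self.vectors = {s: defaultdict(float) for s in self.synonyms}

    def _features(self, context_words):
        # Bag-of-words context features; the paper may use richer features.
        return Counter(context_words)

    def train_vsm(self, examples):
        """examples: list of (context_words, gold_synonym) pairs."""
        for words, gold in examples:
            for f, c in self._features(words).items():
                self.vectors[gold][f] += c

    def classify(self, context_words):
        feats = self._features(context_words)
        return max(self.synonyms, key=lambda s: cosine(feats, self.vectors[s]))

    def discriminative_train(self, examples, epochs=5, lr=0.1):
        """When the baseline picks the wrong synonym, reward the observed
        features for the gold synonym (positive features) and penalize
        them for the wrongly predicted one (negative features)."""
        for _ in range(epochs):
            for words, gold in examples:
                pred = self.classify(words)
                if pred != gold:
                    for f, c in self._features(words).items():
                        self.vectors[gold][f] += lr * c
                        self.vectors[pred][f] = max(self.vectors[pred][f] - lr * c, 0.0)


if __name__ == "__main__":
    # Toy usage with a hypothetical near-synonym set {error, mistake}.
    model = NearSynonymVSM(["error", "mistake"])
    train = [
        (["syntax", "compiler", "reported"], "error"),
        (["careless", "spelling", "made"], "mistake"),
    ]
    model.train_vsm(train)
    model.discriminative_train(train)
    print(model.classify(["compiler", "reported", "fatal"]))  # expected: "error"
```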

Keywords: Natural language processing, Lexical substitution, Near-synonym learning, Discriminative training, Vector space model

Article history: Received 12 September 2015, Revised 23 March 2016, Accepted 13 May 2016, Available online 20 May 2016, Version of Record 18 June 2016.

Paper URL: https://doi.org/10.1016/j.knosys.2016.05.025