Cross-language plagiarism detection over continuous-space- and knowledge graph-based representations of language

Authors:

Highlights:

• We study the combination of knowledge graph and continuous space representations for cross-language plagiarism detection.

• We also compare methods that only make use of continuous-space representations of text.

• We present the continuous word alignment-based similarity analysis, a model to estimate similarity between text fragments.

• We obtain state-of-the-art performance compared to several state-of-the-art models.

Keywords: Cross-language, Plagiarism detection, Continuous representations, Knowledge graphs, Multilingual semantic network

Review history: Received 13 January 2016, Revised 21 July 2016, Accepted 5 August 2016, Available online 6 August 2016, Version of Record 23 September 2016.

Official paper URL: https://doi.org/10.1016/j.knosys.2016.08.004