Mining a Persian–English comparable corpus for cross-language information retrieval

Authors:

Highlights:

• We propose a novel method for mining high-quality translations from a comparable corpus.

• We introduce the Term Association Network (TAN) for mining translation knowledge.

• We propose a new method for checking term translation validity using cross outlier detection.

• Results show that the proposed methods significantly outperform the dictionary-based method.

• Our methods are especially effective in translating out-of-vocabulary (OOV) terms by expanding query words.

Keywords: Comparable corpora, Cross-language information retrieval, Term association network, Translation validity check

Review history: Received 16 June 2011, Revised 17 October 2013, Accepted 17 October 2013, Available online 11 November 2013.

Official paper URL: https://doi.org/10.1016/j.ipm.2013.10.002