Semantically Smooth Bilingual Phrase Embeddings Based on Recursive Autoencoders
Authors: Qian Lin, Jing Yang, Xiangwen Zhang, Hongji Wang, Yaojie Lu, Jinsong Su
Abstract
In this paper, we propose Semantically Smooth Bilingual Recursive Autoencoders to learn bilingual phrase embeddings. The intuition behind our work is to exploit the intrinsic geometric structure of the embedding space and constrain the learned phrase embeddings to be semantically smooth. Specifically, we extend conventional bilingual recursive autoencoders with regularization terms that preserve the translation and paraphrase probability distributions, so that the model simultaneously exploits richer explicit and implicit similarity constraints for bilingual phrase embeddings. To examine the effectiveness of our model, we incorporate two phrase-level similarity features based on the proposed model into a state-of-the-art phrase-based statistical machine translation system. Experiments on NIST Chinese–English test sets show that our model achieves substantial improvements over the baseline.
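The abstract does not give the form of the regularization terms, so the following is only an illustrative sketch of what a distribution-preserving penalty could look like, not the authors' formulation. It assumes the embedding space induces a distribution over candidate phrases through a softmax over negative squared distances, and that the penalty is the KL divergence from a reference translation (or paraphrase) distribution to that induced distribution. All function names, vectors, and probabilities below are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def embedding_distribution(anchor, candidates):
    """Distribution over candidates induced by closeness to the anchor.

    Assumption: closer embeddings receive higher probability via a
    softmax over negative squared distances.
    """
    return softmax([-sq_dist(anchor, c) for c in candidates])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of the same length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def smoothness_penalty(anchor, candidates, reference_probs):
    """Hypothetical regularizer: penalize embeddings whose induced
    distribution departs from the reference probability distribution."""
    induced = embedding_distribution(anchor, candidates)
    return kl_divergence(reference_probs, induced)


# Toy data (made up): one source phrase, three candidate target phrases.
source_phrase = [0.2, 0.1]
target_candidates = [[0.25, 0.05], [0.9, -0.4], [-0.6, 0.8]]
translation_probs = [0.7, 0.2, 0.1]

penalty = smoothness_penalty(source_phrase, target_candidates, translation_probs)
```

Under these assumptions, the penalty is zero when the induced distribution matches the reference distribution and grows as the two diverge, so adding it to a reconstruction objective would pull the embeddings of likely translations or paraphrases closer together.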
Keywords: Bilingual phrase embeddings, Similarity constraints, Machine translation
Review process:
Official paper URL: https://doi.org/10.1007/s11063-020-10210-1