Topic-based term translation models for statistical machine translation

Authors:

Abstract

Term translation is of great importance for machine translation. In this article, we investigate three issues of term translation in the context of statistical machine translation (SMT) and propose three corresponding models: (a) a term translation disambiguation model that uses domain information to select desirable translations for source-language terms, (b) a term translation consistency model that encourages consistent translations throughout a document for terms with a high strength of translation consistency, and (c) a term unithood model that rewards translation hypotheses in which source terms are translated into target strings as a whole unit. We integrate the three models into hierarchical phrase-based SMT and evaluate their effectiveness on NIST Chinese–English translation with large-scale training data. Experimental results show that all three models achieve substantial improvements over the baseline. Our analyses also suggest that the proposed models are capable of improving term translation.

Keywords: Term, Term translation disambiguation, Term translation consistency, Term unithood, Statistical machine translation

Article history: Received 24 August 2014, Revised 9 December 2015, Accepted 14 December 2015, Available online 18 December 2015, Version of Record 23 December 2015.

Official paper URL: https://doi.org/10.1016/j.artint.2015.12.002