Translating cross-lingual spelling variants using transformation rules

Abstract

Technical terms and proper names are a major problem in dictionary-based cross-language information retrieval (CLIR). However, such terms in different languages often share the same Latin or Greek origin and are thus spelling variants of each other. In this paper, we present a novel two-step fuzzy translation technique for cross-lingual spelling variants. In the first step, transformation rules are applied to source words to render them more similar to their target-language equivalents. The rules are generated automatically using translation dictionaries as source data. In the second step, the intermediate forms obtained in the first step are translated into the target language using fuzzy matching. The effectiveness of the technique was evaluated empirically using five source languages, with English as the target language. The two-step technique performed better, in some cases considerably better, than fuzzy matching alone. Even the first step used on its own showed promising results.
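
The two steps described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the two rules in `EXAMPLE_RULES` are hand-written (the paper generates its rules automatically from translation dictionaries), the target word list is a toy, and the standard library's `difflib.get_close_matches` stands in for whatever fuzzy matching method the paper actually uses.

```python
import re
from difflib import get_close_matches

# Hypothetical, hand-written German -> English rules (regex pattern, replacement).
# The paper derives its rules automatically; these are illustrative only.
EXAMPLE_RULES = [
    (r"k", "c"),     # "konstruktion" -> "construction"
    (r"ie$", "y"),   # word-final "ie" -> "y", e.g. "pharmacologie" -> "pharmacology"
]

# Toy target-language (English) word list, assumed for this example.
TARGET_WORDS = ["construction", "contraction", "pharmacology", "electrolysis"]


def apply_rules(word, rules):
    """Step 1: rewrite a source word into an intermediate form."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


def fuzzy_translate(word, rules, target_words, n=3, cutoff=0.6):
    """Step 2: fuzzy-match the intermediate form against the target word list.

    Returns up to n candidates, best match first. difflib is a stand-in
    matcher here, not the method evaluated in the paper.
    """
    intermediate = apply_rules(word.lower(), rules)
    return get_close_matches(intermediate, target_words, n=n, cutoff=cutoff)


if __name__ == "__main__":
    for source_word in ["konstruktion", "pharmakologie", "elektrolyse"]:
        print(source_word, "->", fuzzy_translate(source_word, EXAMPLE_RULES, TARGET_WORDS))
```

In this sketch, the rules alone already turn "konstruktion" into an exact target word, while "elektrolyse" only becomes "electrolyse" and still depends on the fuzzy matching step to reach "electrolysis". That is the motivation for combining the two steps rather than relying on either one alone.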

Keywords: Cross-language retrieval, Fuzzy matching, Transliteration

Article history: Received 25 August 2003, Accepted 2 February 2004, Available online 20 March 2004.

Official paper URL: https://doi.org/10.1016/j.ipm.2004.02.001