MultiWiBi: The multilingual Wikipedia bitaxonomy project

Authors:

Abstract

We present MultiWiBi, an approach to the automatic creation of two integrated taxonomies for Wikipedia pages and categories written in different languages. In order to create both taxonomies in an arbitrary language, we first build them in English and then project the two taxonomies to other languages automatically, without the help of language-specific resources or tools. The process crucially leverages a novel algorithm which exploits the information available in either one of the taxonomies to reinforce the creation of the other taxonomy. Our experiments show that the taxonomical information in MultiWiBi is characterized by a higher quality and coverage than state-of-the-art resources like DBpedia, YAGO, MENTA, WikiNet, LHD and WikiTaxonomy, also across languages. MultiWiBi is available online at http://wibitaxonomy.org/multiwibi.

Keywords: Taxonomy extraction, Taxonomy induction, Machine learning, Natural language processing, Collaborative resources, Wikipedia

Article history: Received 14 May 2015, Revised 10 August 2016, Accepted 15 August 2016, Available online 8 September 2016, Version of Record 19 September 2016.

Paper URL: https://doi.org/10.1016/j.artint.2016.08.004