Leveraging Wikipedia knowledge to classify multilingual biomedical documents

Authors:

Highlights:

• We apply a Wikipedia bag-of-concepts representation to classify multilingual biomedical documents.

• We propose a technique to convert Wikipedia concepts from one language to another.

• We create a multilingual corpus about biomedical topics, named ML-UVigoMED.

• We evaluate our approach by conducting experiments on the ML-UVigoMED corpus.

• We improve the performance of multilingual biomedical document classifiers.
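
The idea behind the first two highlights can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a semantic annotator (such as Wikipedia Miner, named in the keywords) has already returned a list of Wikipedia concept titles per document, and that a cross-language title table is available. Wikipedia's interlanguage links are one plausible source for such a table, but the highlights do not say which mechanism the paper uses. The sample concepts, the `ES_TO_EN` table, and both helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical annotator output: Wikipedia concept titles found in each document.
# In practice these would come from a semantic annotator such as Wikipedia Miner.
spanish_doc_concepts = ["Diabetes mellitus", "Insulina", "Insulina", "Páncreas"]
english_doc_concepts = ["Diabetes mellitus", "Insulin", "Pancreas"]

# Hypothetical cross-language table (Spanish title -> English title).
ES_TO_EN = {
    "Diabetes mellitus": "Diabetes mellitus",
    "Insulina": "Insulin",
    "Páncreas": "Pancreas",
}


def to_english(concepts, mapping):
    """Map concept titles to English, dropping concepts with no known equivalent."""
    return [mapping[c] for c in concepts if c in mapping]


def bag_of_concepts(concepts):
    """Count how often each concept occurs in a document."""
    return Counter(concepts)


# Once both documents are expressed in the same concept vocabulary,
# their bags of concepts can be compared or fed to a single classifier.
es_bag = bag_of_concepts(to_english(spanish_doc_concepts, ES_TO_EN))
en_bag = bag_of_concepts(english_doc_concepts)
shared = sorted(set(es_bag) & set(en_bag))
```

Mapping every document into one concept vocabulary means a classifier trained on documents in one language can, in principle, be applied to documents in another, which is the motivation the highlights suggest.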

Keywords: Biomedical document classification, Hybrid word-concept document representation, Multilingual text classification, Wikipedia-based bag of concepts document representation, Wikipedia Miner semantic annotator

Article history: Received 11 August 2017, Revised 6 April 2018, Accepted 23 April 2018, Available online 3 May 2018, Version of Record 7 June 2018.

Official paper URL: https://doi.org/10.1016/j.artmed.2018.04.007