Excavating the mother lode of human-generated text: A systematic review of research that uses the Wikipedia corpus

Highlights:

• Wikipedia provides rich, natural semi-structured texts for information retrieval.

• It provides semantic information for keyword extraction from varied texts.

• It facilitates clustering, text classification and semantic relatedness analyses.

• It supplies a semantically structured knowledge base for studying ontologies.

Keywords: Information retrieval, Information extraction, Natural language processing, Ontologies, Wikipedia, Literature review

Review history: Received 30 November 2014, Revised 19 July 2016, Accepted 28 July 2016, Available online 27 October 2016, Version of Record 19 January 2017.

Official paper URL: https://doi.org/10.1016/j.ipm.2016.07.003