Excavating the mother lode of human-generated text: A systematic review of research that uses the Wikipedia corpus
Highlights:
• Wikipedia provides rich, natural semi-structured texts for information retrieval.
• It provides semantic information for keyword extraction from varied texts.
• It facilitates clustering, text classification and semantic relatedness analyses.
• It supplies a semantically structured knowledge base for studying ontologies.
Keywords: Information retrieval, Information extraction, Natural language processing, Ontologies, Wikipedia, Literature review
Review history: Received 30 November 2014, Revised 19 July 2016, Accepted 28 July 2016, Available online 27 October 2016, Version of Record 19 January 2017.
Official paper URL: https://doi.org/10.1016/j.ipm.2016.07.003