Supporting knowledge discovery for biodiversity
Authors:
Highlights:
• We introduce a methodology for extracting the semantics of a text, summarized as a conceptual graph (CG).
• CGs are derived from dependency-based parsing that also uses constituency information.
• CGs act as indexes for information retrieval (IR), dealing with text incompleteness and vagueness.
• The resulting IR system is tested on a botanical corpus using a topic set whose queries span different levels of difficulty.
• Conceptual retrieval outperforms classic retrieval at every level of difficulty, particularly at the high level.
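To make the idea of a CG acting as a retrieval index more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the dependency triples are hand-written stand-ins for parser output, the relation labels (`agent`, `location`, `time`) are hypothetical, and ranking by shared-edge count is an assumed scoring rule chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical, hand-written dependency triples (head, relation, dependent).
# A real system would obtain these from a parser; no parser is called here.
DOCS = {
    "d1": [("grow", "agent", "orchid"), ("grow", "location", "forest")],
    "d2": [("bloom", "agent", "orchid"), ("bloom", "time", "spring")],
    "d3": [("grow", "agent", "fern"), ("grow", "location", "forest")],
}


def build_graph(triples):
    """Represent a toy conceptual graph as a set of (concept, relation, concept) edges."""
    return frozenset(triples)


def build_index(docs):
    """Map every edge to the set of documents whose graph contains it."""
    index = defaultdict(set)
    for doc_id, triples in docs.items():
        for edge in build_graph(triples):
            index[edge].add(doc_id)
    return index


def search(index, query_triples):
    """Rank documents by the number of query edges their graph shares (assumed scoring)."""
    scores = defaultdict(int)
    for edge in build_graph(query_triples):
        for doc_id in index.get(edge, ()):
            scores[doc_id] += 1
    # Sort by descending score, then by document id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


INDEX = build_index(DOCS)
RESULT = search(INDEX, [("grow", "agent", "orchid"), ("grow", "location", "forest")])
print(RESULT)
```

In this toy setup, a keyword search for "orchid" and "forest" could not tell a document about orchids growing in forests from one that merely mentions both words, whereas matching on edges keeps the relation between the concepts. Handling the incompleteness and vagueness mentioned above would need more than exact edge matching, which this sketch does not attempt.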
Keywords: Knowledge discovery, Natural language processing, Text mining
Review history: Received 1 September 2014, Accepted 26 August 2015, Available online 5 September 2015, Version of Record 10 November 2015.
Official paper URL: https://doi.org/10.1016/j.datak.2015.08.002