Topical keyphrase extraction with hierarchical semantic networks
Authors:
Highlights:
• We propose a topical keyphrase extraction method for summarizing documents
• We use a hierarchical semantic network to extract keyphrases for each topic
• The proposed method reflects the intrinsic semantics and relationships of keyphrases
Abstract
Topical keyphrase extraction is used to summarize large collections of text documents. However, traditional methods cannot properly reflect the intrinsic semantics and relationships of keyphrases because they rely on a simple term-frequency-based process. Consequently, these methods are not effective in obtaining significant contextual knowledge. To resolve this, we propose a topical keyphrase extraction method based on a hierarchical semantic network and multiple centrality network measures that together reflect the hierarchical semantics of keyphrases. We conduct experiments on real data to examine the practicality of the proposed method and to compare its performance with that of existing topical keyphrase extraction methods. The results confirm that the proposed method outperforms state-of-the-art topical keyphrase extraction methods in terms of the representativeness of the selected keyphrases for each topic. The proposed method can effectively reflect intrinsic keyphrase semantics and interrelationships.
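As a rough illustration of what ranking phrases by multiple centrality measures can look like, the following minimal sketch scores the nodes of a small phrase graph by degree and closeness centrality and averages the two. It is not the paper's method: the abstract does not say which centrality measures are used or how they are combined, and the hierarchical structure of the network is not modeled here. The toy edges, the choice of the two measures, the equal-weight average, and all function names are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical toy phrase graph for one topic (an assumption, not data from the paper).
# An undirected edge means the two phrases are treated as semantically related.
EDGES = [
    ("topic model", "latent dirichlet allocation"),
    ("topic model", "keyphrase extraction"),
    ("topic model", "text mining"),
    ("keyphrase extraction", "text mining"),
    ("keyphrase extraction", "pagerank"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of phrase pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable node count divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for src in graph:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[src] = (len(dist) - 1) / total if total else 0.0
    return scores


def rank_phrases(graph):
    """Rank phrases by the plain average of the two centralities (assumed weighting)."""
    deg = degree_centrality(graph)
    clo = closeness_centrality(graph)
    combined = {v: (deg[v] + clo[v]) / 2 for v in graph}
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(combined, key=lambda v: (-combined[v], v))


graph = build_graph(EDGES)
ranking = rank_phrases(graph)
print(ranking)
```

In this toy graph the two best-connected phrases tie for the top score and the leaf phrases rank last. A weighted combination or a PageRank-style score could be swapped in for the simple average.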
Abbreviations: HSN, Hierarchical semantic network; LDA, Latent Dirichlet allocation; TF–ITF, Term frequency–inverse topic frequency; TPR, Topical PageRank; TR, TextRank; WPR, Weighted PageRank
Keywords: Topical keyphrase extraction, Semantic relationships, Hierarchical networks, Phrase rankings, Text mining
Article history: Received 18 April 2019, Revised 15 September 2019, Accepted 15 September 2019, Available online 31 October 2019, Version of Record 16 November 2019.
Official paper URL: https://doi.org/10.1016/j.dss.2019.113163