What is this Cluster about? Explaining textual clusters by extracting relevant keywords
Authors:
Highlights:
• We propose a score-based, knowledge-based, and (semi-)supervised method for explaining text clusters.
• We show how to use external knowledge to expand score-based explanations using an ILP model.
• The ILP model with external knowledge can control the diversity and consistency of explanations.
• In the semi-supervised approach, our metrics drop by only 9% when the labels are reduced by 70%.
• We propose a modification of the current evaluation metrics to reduce bias towards common labels.
Keywords: Document clustering, Text analytics, Explainability, Cluster summarization, Cluster labelling, Clusters explanations
Review history: Received 3 July 2020, Revised 9 October 2020, Accepted 23 July 2021, Available online 4 August 2021, Version of Record 12 August 2021.
Official paper URL: https://doi.org/10.1016/j.knosys.2021.107342