Formal concept analysis for topic detection: A clustering quality experimental analysis

Authors:

Highlights:

• We propose a novel application of FCA-based methods to Topic Detection that overcomes traditional problems of clustering and classification techniques.

• We achieve state-of-the-art results on the topic detection task at RepLab 2013.

• We propose an evaluation framework to measure the quality of topic detection algorithms, comprising both an external and an internal (quality-based) evaluation methodology.

• We conduct an extensive analysis of how Hierarchical Agglomerative Clustering and Latent Dirichlet Allocation perform on the topic detection task in comparison to FCA.

• We show that the proposed FCA-based approach is better, in terms of clustering quality, than the other two.

Keywords: Formal concept analysis, Topic detection, Clustering quality analysis, Hierarchical agglomerative clustering, Latent Dirichlet allocation

Article history: Received 11 May 2015, Revised 1 November 2016, Accepted 31 January 2017, Available online 1 February 2017, Version of Record 11 February 2017.

Official paper URL: https://doi.org/10.1016/j.is.2017.01.008