Text document clustering using Spectral Clustering algorithm with Particle Swarm Optimization

Authors:

Highlights:

• An automatic text clustering framework for handling unstructured documents.

• SCPSO, a Spectral Clustering algorithm based on Particle Swarm Optimization, is proposed.

• The proposed method is able to group the documents based on their content.

• The SCPSO algorithm improves Cluster Purity as the number of clusters increases.

• The proposed SCPSO method outperforms three competing methods in terms of Cluster Purity.
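
The highlights report results in terms of Cluster Purity. As a minimal sketch, assuming the standard definition of purity (each cluster is credited with its most frequent true class, and the credited counts are divided by the number of documents), the metric could be computed as follows. The function name `cluster_purity` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cluster_purity(labels_true, labels_pred):
    """Return purity: the fraction of items that belong to the majority
    true class of the cluster they were assigned to."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    if len(labels_true) == 0:
        raise ValueError("label sequences must not be empty")

    # Count true classes inside each predicted cluster.
    class_counts_per_cluster = {}
    for true_label, cluster_id in zip(labels_true, labels_pred):
        class_counts_per_cluster.setdefault(cluster_id, Counter())[true_label] += 1

    # Credit each cluster with the size of its majority class.
    majority_total = sum(
        max(counts.values()) for counts in class_counts_per_cluster.values()
    )
    return majority_total / len(labels_true)


# Toy example (hypothetical data): six documents, three topics, two clusters.
true_topics = ["sports", "sports", "politics", "politics", "tech", "tech"]
two_clusters = [0, 0, 0, 1, 1, 1]
purity_two = cluster_purity(true_topics, two_clusters)

# Degenerate case: every document in its own cluster.
singletons = list(range(len(true_topics)))
purity_singletons = cluster_purity(true_topics, singletons)
```

Under this definition, placing every document in its own cluster yields a purity of 1.0, so purity tends to rise with the number of clusters; comparisons between methods are most meaningful at the same cluster count.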

Keywords: Text mining, Information retrieval, Text clustering, Spectral clustering, Optimization techniques, SK-means, Expectation-Maximization, Particle Swarm Optimization, SCPSO

Article history: Received 23 November 2018, Revised 22 May 2019, Accepted 23 May 2019, Available online 24 May 2019, Version of Record 6 June 2019.

Official paper URL: https://doi.org/10.1016/j.eswa.2019.05.030