Latent semantic analysis for vector space expansion and fuzzy logic-based genetic clustering
作者:Wei Song, Soon Cheol Park
摘要
This paper proposes an improved latent semantic analysis (LSA) model to represent textual document and takes advantage of a fuzzy logic based genetic algorithm (FLGA) for clustering. The standard genetic algorithm (GA) in conventional vector space model is rather difficult to deal with because the high dimensional encoding of GA makes it explore the optimal solution in a complicated space which is prone to cause an overflow problem. The LSA-based corpus model not only reduces the dimensions drastically, but also creates an underlying semantic structure which enhances its ability of distinguishing documents in terms of concepts and indirectly improves the ability of GA for clustering (genetic clustering). A novel FLGA is proposed in conjunction with this semantic model in this study. According to the nature of biological evolution, several fuzzy controllers are given to adaptively adjust and optimize the behaviors of the GA which can effectively prevent the premature convergence to a suboptimum solution. The experiment results show that the fuzzy logic controllers enhance the ability of the GA to explore the global optimum solution, and the utilization of the LSA-based text representation method to FLGA further improves its clustering performance.
论文关键词:Clustering, Latent semantic analysis, Dimensionality reduction, Genetic algorithm, Fuzzy logic, Multidimensional scaling
论文评审过程:
论文官网地址:https://doi.org/10.1007/s10115-009-0191-5