Multi-objective frequent termset clustering
作者:Katharina Morik, Andreas Kaspari, Michael Wurst, Marcin Skirzynski
摘要
Large media collections rapidly evolve in the World Wide Web. In addition to the targeted retrieval as is performed by search engines, browsing and explorative navigation is an important issue. Since the collections grow fast and authors most often do not annotate their web pages according to a given ontology, automatic structuring is in demand as a prerequisite for any pleasant human–computer interface. In this paper, we investigate the problem of finding alternative high-quality structures for navigation in a large collection of high-dimensional data. We express desired properties of frequent termset clustering (FTS) in terms of objective functions. In general, these functions are conflicting. This leads to the formulation of FTS clustering as a multi-objective optimization problem. The optimization is solved by a genetic algorithm. The result is a set of Pareto-optimal solutions. Users may choose their favorite type of a structure for their navigation through a collection or explore the different views given by the different optimal solutions. We explore the capability of the new approach to produce structures that are well suited for browsing on a social bookmarking data set.
论文关键词:Collaborative clustering, Distributed data mining, Ensemble clustering, Multi-media collections
论文评审过程:
论文官网地址:https://doi.org/10.1007/s10115-011-0431-3