Density link-based methods for clustering web pages

Authors:

Abstract

The World Wide Web is a huge information space, which makes it a valuable resource for decision making. To serve that purpose, however, it must be managed effectively. One important management technique is clustering web data. In this paper, we propose several improvements to clustering methods that yield higher-quality clusters. First, we study a new density-based method adapted for hierarchical clustering of web documents. Then, using the hyperlink structure of the web, we propose a new method that combines density concepts with the web graph. These algorithms have the advantage of low complexity and, as the experimental results show, the resulting clusters are of high quality.

Keywords: Web clustering, Density-based clustering, Hyperlink structure, Hierarchical clustering

Review history: Received 13 February 2008, Revised 22 February 2009, Accepted 2 April 2009, Available online 8 April 2009.

Official paper URL: https://doi.org/10.1016/j.dss.2009.04.002