Reinforcing Web-object Categorization Through Interrelationships
Authors: GUI-RONG XUE, YONG YU, DOU SHEN, QIANG YANG, HUA-JUN ZENG, ZHENG CHEN
Abstract
Existing categorization algorithms deal with homogeneous Web objects and, when they take interrelationships with other types of objects into account, treat the interrelated objects merely as additional features. However, focusing on any single aspect of the inter-object relationship is not sufficient to fully reveal the true categories of Web objects. In this paper, we propose a novel categorization algorithm, called the Iterative Reinforcement Categorization Algorithm (IRC), to exploit the full interrelationship between different types of Web objects, including Web pages and queries. IRC classifies the interrelated Web objects by iteratively reinforcing the individual classification results of different types of objects via their interrelationship. Experiments on a clickthrough-log dataset from the MSN search engine show that, in terms of the F1 measure, IRC achieves a 26.4% improvement over a pure content-based classification method, a 21% improvement over a query-metadata-based method, and a 16.4% improvement over the well-known virtual document-based method. Our experiments also show that IRC converges fast enough to be applicable to real-world applications.
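The abstract does not state IRC's update rules, so the following is only an illustrative sketch of the general idea of iterative reinforcement between two object types (queries and pages linked by clicks). Everything in it is an assumption made for illustration rather than the paper's formulation: the function names, the mixing weight `alpha`, the use of click counts as link weights, and the stopping tolerance.

```python
def normalize(row):
    """Scale a score vector to sum to 1 (all-zero vectors are left unchanged)."""
    total = sum(row)
    return [v / total for v in row] if total > 0 else list(row)


def propagate(links, scores, n_classes):
    """For each object, take the weighted average of its neighbors' class scores."""
    out = []
    for neighbors in links:
        acc = [0.0] * n_classes
        for j, weight in neighbors:
            for c in range(n_classes):
                acc[c] += weight * scores[j][c]
        out.append(normalize(acc))
    return out


def mix(prior, neighbor, alpha):
    """Blend each object's own (content-based) scores with its neighbor scores."""
    return [
        normalize([alpha * p + (1 - alpha) * n for p, n in zip(pr, nb)])
        for pr, nb in zip(prior, neighbor)
    ]


def iterative_reinforcement(page_prior, query_prior, clicks,
                            alpha=0.5, max_iter=50, tol=1e-6):
    """Hypothetical reinforcement loop; `clicks` holds (query, page, count) triples."""
    n_classes = len(page_prior[0])
    q_links = [[] for _ in query_prior]
    p_links = [[] for _ in page_prior]
    for q, p, count in clicks:
        q_links[q].append((p, count))
        p_links[p].append((q, count))

    page_base = [normalize(r) for r in page_prior]
    query_base = [normalize(r) for r in query_prior]
    pages, queries = page_base, query_base
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Queries are re-scored from the pages they led to, then pages from queries.
        new_queries = mix(query_base, propagate(q_links, pages, n_classes), alpha)
        new_pages = mix(page_base, propagate(p_links, new_queries, n_classes), alpha)
        delta = max(
            abs(a - b)
            for old, new in ((pages, new_pages), (queries, new_queries))
            for row_old, row_new in zip(old, new)
            for a, b in zip(row_old, row_new)
        )
        pages, queries = new_pages, new_queries
        if delta < tol:
            break
    return pages, queries, iterations


# Toy data (made up): page 0 has a confident content-based score, page 1 does not.
page_prior = [[0.9, 0.1], [0.5, 0.5]]
query_prior = [[0.5, 0.5], [0.5, 0.5]]
clicks = [(0, 0, 3), (0, 1, 1), (1, 1, 2)]
pages, queries, iterations = iterative_reinforcement(page_prior, query_prior, clicks)
```

In this toy setup, page 1 and query 1 start with uninformative scores and pick up evidence for the first class only through their click links to better-classified objects, which is the kind of mutual reinforcement the abstract describes.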
Keywords: categorization, interrelated Web objects, iterative reinforcement, clickthrough data
Paper URL: https://doi.org/10.1007/s10618-005-0015-5