Complementing search engines with online web mining agents

Authors:

Highlights:

Abstract

While search engines have become the major decision support tools for the Internet, there is a growing disparity between the image of the World Wide Web stored in search engine repositories and the actual dynamic, distributed nature of Web data. We propose to attack this problem using an adaptive population of intelligent agents mining the Web online at query time. We discuss the benefits and shortcomings of using dynamic search strategies versus the traditional static methods in which search and retrieval are disjoint. This paper presents a public Web intelligence tool called MySpiders, a threaded multiagent system designed for information discovery. The performance of the system is evaluated by comparing its effectiveness in locating recent, relevant documents with that of search engines. We present results suggesting that augmenting search engines with adaptive populations of intelligent search agents can lead to a significant competitive advantage. We also discuss some of the challenges of evaluating such a system on current Web data, introduce three novel metrics for this purpose, and outline some of the lessons learned in the process.

Keywords: Web mining, Search engines, Web intelligence, InfoSpiders, MySpiders, Evaluation metrics, Estimated recency, Precision, Recall

Publication history: Available online 31 May 2002.

Paper URL: https://doi.org/10.1016/S0167-9236(02)00106-9