Learning to Understand Information on the Internet: An Example-Based Approach
Authors: Mike Perkowitz, Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld
Abstract
The explosive growth of the Web has made intelligent software assistants increasingly necessary for ordinary computer users. Both traditional approaches (search engines, hierarchical indices) and intelligent software agents require significant amounts of human effort to keep up with the Web. As an alternative, we investigate the problem of automatically learning to interact with information sources on the Internet. We report on ShopBot and ILA, two implemented agents that learn to use such resources. ShopBot learns how to extract information from online vendors using only minimal knowledge about product domains. Given the home pages of several online stores, ShopBot autonomously learns how to shop at those vendors. After its learning is complete, ShopBot is able to speedily visit over a dozen software stores and CD vendors, extract product information, and summarize the results for the user. ILA learns to translate information from Internet sources into its own internal concepts. ILA builds a model of an information source that specifies the translation between the source's output and ILA's model of the world. ILA is capable of leveraging a small amount of knowledge about a domain to learn models of many information sources. We show that ILA's learning is fast and accurate, requiring only a small number of queries per information source.
Keywords: machine learning, internet
Official paper URL: https://doi.org/10.1023/A:1008672508721