Intent mining in search query logs for automatic search script generation

Authors: Chieh-Jen Wang, Hsin-Hsi Chen

Abstract

Capturing users’ information needs is essential to lowering the barriers to information access. This paper mines sequences of actions, called search scripts, from search query logs, which record the search experiences of large numbers of users. Search scripts can be applied to guide users toward satisfying their information needs, improve the search effectiveness of retrieval systems, recommend advertisements at suitable places, and so on. Information quality, query ambiguity, topic diversity, and document relevance are four major challenges in search script mining. In this paper, we determine the relevance of URLs for a query, adopt the Open Directory Project (ODP) categories to disambiguate queries and URLs, explore various features and clustering algorithms for intent clustering, identify critical actions from each intent cluster to form a search script, generate a natural language description for each action, and summarize a topic for each search script. Experiments show that the complete-link hierarchical clustering algorithm with the features of query terms, relevant URLs, and disambiguated ODP categories performs best. Applying the intent clusters created by the best model to intent boundary identification achieves an \(F\) score of 0.6666. The intent clusters are then applied to generate search scripts.
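
As a rough illustration of the clustering step named above, the following minimal sketch runs complete-link agglomerative clustering over sets of mixed features (query terms, relevant URLs, ODP categories). The abstract does not state the distance measure, feature weighting, or stopping criterion the authors used, so the Jaccard distance, the `threshold` parameter, the `q:`/`url:`/`odp:` prefixes, and the toy sessions are all assumptions made for illustration only.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two feature sets (an assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def complete_link_clusters(items, threshold):
    """Agglomerative clustering with complete linkage.

    Repeatedly merges the two closest clusters, where the distance between
    clusters is the MAXIMUM pairwise distance between their members, and
    stops once the closest pair is farther apart than `threshold`.
    Returns clusters as lists of indices into `items`.
    """
    clusters = [[i] for i in range(len(items))]

    def linkage(c1, c2):
        return max(jaccard_distance(items[i], items[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        (a, b), dist = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Hypothetical sessions: each is a set of query terms, a relevant URL,
# and a disambiguated ODP category.
sessions = [
    {"q:cheap", "q:flights", "url:expedia.com", "odp:Recreation/Travel"},
    {"q:flights", "q:paris", "url:expedia.com", "odp:Recreation/Travel"},
    {"q:python", "q:tutorial", "url:python.org", "odp:Computers/Programming"},
    {"q:python", "q:list", "url:python.org", "odp:Computers/Programming"},
]

clusters = complete_link_clusters(sessions, threshold=0.7)
print(clusters)
```

With these toy inputs the two travel sessions and the two programming sessions share three of five features each and have nothing in common across topics, so the sketch groups them into two intent clusters. Complete linkage tends to favor compact clusters, since a merge is only as good as its two most dissimilar members.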

Keywords: Intent mining, Query log analysis, Search script generation, Web search enhancement

Paper URL: https://doi.org/10.1007/s10115-013-0620-3