Combining fields for query expansion and adaptive query expansion

Authors:

Highlights:

Abstract

In this paper, we aim to improve query expansion for ad-hoc retrieval by proposing a more fine-grained term reweighting process. This process uses statistics from the representation of documents in various fields, such as their titles, the anchor text of their incoming links, and their body content. The contribution of this paper is twofold. First, we propose a novel query expansion mechanism on fields that combines the field evidence available in a corpus. Second, we propose an adaptive query expansion mechanism that selects an appropriate collection resource, either the local collection or a high-quality external resource, for query expansion on a per-query basis. The two proposed query expansion approaches are thoroughly evaluated on two standard Text Retrieval Conference (TREC) Web collections, namely the WT10G collection and the large-scale .GOV2 collection. In the experimental results, we observe a statistically significant improvement over the baselines. Moreover, we conclude that the adaptive query expansion mechanism is very effective when the external collection used is much larger than the local collection.
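To make the two ideas in the abstract concrete, the following is a minimal sketch and not the paper's actual model. It assumes hypothetical field weights, a simple divergence-style term score computed over field-weighted frequencies from the pseudo-relevant documents, and two hypothetical per-query predictor scores for choosing between the local and the external collection. All function names, parameters, and numeric values here are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical field weights (assumption; the abstract gives no values).
FIELD_WEIGHTS = {"title": 2.0, "anchor": 1.5, "body": 1.0}


def combined_tf(doc, weights=FIELD_WEIGHTS):
    """Merge the per-field term counts of one document into weighted frequencies."""
    tf = Counter()
    for field, w in weights.items():
        for term in doc.get(field, "").lower().split():
            tf[term] += w
    return tf


def expansion_terms(feedback_docs, collection_tf, collection_len, k=3):
    """Rank candidate expansion terms by how much more likely they are in the
    feedback documents than in the whole collection (a divergence-style score,
    used here only as a stand-in for the paper's term weighting)."""
    fb = Counter()
    for doc in feedback_docs:
        fb.update(combined_tf(doc))  # Counter.update adds counts
    fb_len = sum(fb.values())
    scores = {}
    for term, f in fb.items():
        p_fb = f / fb_len
        # 0.5 is an ad hoc smoothing count for terms unseen in the collection.
        p_coll = collection_tf.get(term, 0.5) / collection_len
        scores[term] = p_fb * math.log2(p_fb / p_coll)
    return sorted(scores, key=scores.get, reverse=True)[:k]


def choose_resource(local_score, external_score):
    """Per-query choice of expansion resource from two hypothetical
    predictor scores (higher means the resource looks more useful)."""
    return "external" if external_score > local_score else "local"
```

In this sketch, a term that occurs in a title or in anchor text contributes more to its feedback frequency than the same term in the body, which is one simple way to combine field evidence before reweighting. The selection step only illustrates the per-query decision; how the two scores would be estimated is left open.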

Keywords: Query expansion on fields, Pseudo relevance feedback, Information retrieval, External expansion, Adaptive query expansion, TREC experiments

Article history: Received 19 June 2006, Revised 2 November 2006, Accepted 4 November 2006, Available online 26 January 2007.

Official paper URL: https://doi.org/10.1016/j.ipm.2006.11.002