Query reformulation using automatically generated query concepts from a document space

Authors:

Abstract

We propose a new query reformulation approach that uses a set of query concepts, introduced to denote the user’s information need precisely. Since a document collection is considered a domain that contains latent primitive concepts, we identify those concepts through local pattern discovery and global modeling using data mining techniques. For a new query, we select its most strongly associated primitive concepts and choose the most probable interpretations as query concepts. We discuss whether to construct the primitive concepts from the whole corpus or from the retrieved set of documents. Our experiments are performed on the TREC8 collection. The experimental evaluation shows that our approach performs as well as current query reformulation approaches, while being particularly effective for poorly performing queries. Moreover, we find that generating the primitive concepts from the set of retrieved documents leads to the most effective performance.

Keywords: Concept-based retrieval, Information extraction, Query reformulation, Query concepts

Article history: Received 21 June 2004, Accepted 21 March 2005, Available online 9 June 2005.

Official URL: https://doi.org/10.1016/j.ipm.2005.03.025