Pseudo-relevance feedback based query expansion using boosting algorithm

Authors: Imran Rasheed, Haider Banka, Hamaid Mahmood Khan

Abstract

Retrieving relevant documents from a large collection using only the original query is a formidable challenge. A common way to improve retrieval is pseudo-relevance feedback, which expands the original query with useful keywords so that it returns the documents most relevant to the original query. In this paper, five different hybrid techniques that utilize traditional query expansion methods were tested. A boosting query term method was then proposed to reweight and strengthen the original query. The query-wise analysis revealed that the proposed approach effectively identified the most relevant keywords, even for short queries. The effectiveness of all the proposed methods was evaluated on three datasets: Roshni, Hamshahri1, and FIRE2011. Compared to the traditional query expansion methods, the proposed methods improved the mean average precision on the Urdu, Persian, and English datasets by 14.02%, 9.93%, and 6.60%, respectively. The results were also validated using analysis of variance and post-hoc analysis.
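
To make the general idea concrete, the following is a minimal sketch of pseudo-relevance feedback with boosted original query terms. It is not the authors' method: the function name `expand_query`, the parameters `top_k`, `n_expansion`, and `boost`, the raw term-frequency scoring, and the toy documents are all assumptions chosen for illustration. The abstract does not specify the term-selection or weighting formulas used in the paper.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_k=2, n_expansion=2, boost=2.0):
    """Hypothetical PRF sketch: returns a {term: weight} mapping.

    ranked_docs is a list of token lists, assumed to be already ranked
    by a first-pass retrieval with the original query.
    """
    original = set(query_terms)

    # Treat the top_k ranked documents as pseudo-relevant.
    feedback_docs = ranked_docs[:top_k]

    # Count candidate expansion terms, skipping the original query terms.
    counts = Counter()
    for doc in feedback_docs:
        counts.update(t for t in doc if t not in original)

    # Keep the most frequent candidates; ties are broken alphabetically
    # so the result is deterministic.
    candidates = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    candidates = candidates[:n_expansion]
    total = sum(c for _, c in candidates) or 1

    # Original terms get a fixed boost (an assumed value); expansion
    # terms get their normalized frequency among the selected candidates.
    weights = {t: boost for t in query_terms}
    for term, c in candidates:
        weights[term] = c / total
    return weights


# Toy first-pass ranking (invented data, for illustration only).
docs = [
    ["urdu", "retrieval", "corpus", "news"],
    ["urdu", "corpus", "stemming"],
    ["football", "score"],
]
weights = expand_query(["urdu", "retrieval"], docs)
print(weights)
```

In this sketch the original terms always outweigh the expansion terms, which reflects the stated goal of strengthening the original query; a real system would typically score candidates with a term-selection method rather than raw counts.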

Keywords: Term-selection method, Pseudo-relevance feedback, Rank aggregation method, Query formulation, Information retrieval, Urdu language

Official paper URL: https://doi.org/10.1007/s10462-021-09972-4