Low-cost evaluation techniques for information retrieval systems: A review

Authors:

Abstract

In system-based information retrieval evaluation, the test collection model remains costly. Producing relevance judgments is an expensive, time-consuming task that must be performed by human assessors, and for a large collection it is not viable to assess the relevance of every document in the corpus against each topic. In experimental settings, a partial set of judgments built with a pooling method therefore substitutes for a complete relevance assessment of the documents. As the numbers of documents, topics, and retrieval systems grow, performing low-cost evaluations that still yield reliable results becomes essential. Researchers are seeking techniques that reduce the cost of experimental IR evaluation by reducing the number of relevance judgments to be made, or even eliminating them altogether, while preserving reliability. This paper discusses state-of-the-art approaches to low-cost retrieval evaluation under the following categories: selecting the best sets of documents to judge; evaluation measures robust to incomplete judgments; statistical inference of evaluation metrics; inference of relevance judgments; query selection; techniques for testing the reliability of the evaluation and the reusability of the constructed collections; and other alternative methods to pooling. The paper is intended to point the reader to the corpus of 'must read' papers in the area of low-cost evaluation of IR systems.
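To make two of the abstract's central ideas concrete, the sketch below illustrates depth-k pooling (only the union of each system's top-k retrieved documents is sent to human assessors) and bpref (Buckley & Voorhees, 2004), one well-known example of an evaluation measure designed to be robust to incomplete judgments. This is a minimal illustration under assumed data layouts, not code from the paper; the function names, run identifiers, and document identifiers are invented for the example.

```python
from typing import Dict, List, Set

def depth_k_pool(runs: Dict[str, List[str]], k: int = 100) -> Set[str]:
    """TREC-style depth-k pooling: the union of the top-k documents from
    each system's ranked run. Only pooled documents are judged by human
    assessors; everything outside the pool remains unjudged."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:k])
    return pool

def bpref(ranking: List[str], relevant: Set[str], nonrelevant: Set[str]) -> float:
    """bpref (Buckley & Voorhees, 2004): scores a run using only judged
    documents, so unjudged documents neither help nor hurt the score --
    the property that makes it robust to incomplete judgments."""
    R, N = len(relevant), len(nonrelevant)
    if R == 0:
        return 0.0
    denom = min(R, N)
    nonrel_above = 0  # judged nonrelevant documents seen so far in the ranking
    score = 0.0
    for doc in ranking:
        if doc in nonrelevant:
            nonrel_above += 1
        elif doc in relevant:
            penalty = min(nonrel_above, denom) / denom if denom > 0 else 0.0
            score += 1.0 - penalty
    return score / R

# Toy example: two hypothetical runs, a shallow pool (k=3), and judgments
# obtained only for the pooled documents.
runs = {
    "system_a": ["d1", "d2", "d3", "d4", "d5"],
    "system_b": ["d2", "d6", "d1", "d7", "d8"],
}
pool = depth_k_pool(runs, k=3)                       # {'d1', 'd2', 'd3', 'd6'}
relevant, nonrelevant = {"d1", "d6"}, {"d2", "d3"}   # assessor judgments on the pool
print(bpref(runs["system_a"], relevant, nonrelevant))  # 0.5
```

Note that bpref only ever compares judged relevant documents against judged nonrelevant ones; unjudged documents (d4, d5, and anything outside the pool) are skipped rather than assumed nonrelevant, which is why the measure remains usable with the shallow, incomplete pools this review is concerned with.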

Keywords: Retrieval evaluation, Effectiveness metrics, Relevance judgment, Test collection, Pooling

Article history: Received 24 April 2012, Revised 30 November 2012, Accepted 3 December 2012, Available online 10 January 2013.

DOI: https://doi.org/10.1016/j.joi.2012.12.001