A risk minimization framework for information retrieval

Authors:

Abstract

This paper presents a probabilistic information retrieval framework in which the retrieval problem is formally treated as a statistical decision problem. In this framework, queries and documents are modeled using statistical language models, user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. We discuss how this framework can unify existing retrieval models and accommodate systematic development of new retrieval models. As an example of using the framework to model non-traditional retrieval problems, we derive retrieval models for subtopic retrieval, which is concerned with retrieving documents to cover many different subtopics of a general query topic. These new models differ from traditional retrieval models in that they relax the traditional assumption of independent relevance of documents.
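To make the idea of retrieval as risk minimization concrete, here is a minimal sketch under stated assumptions. The abstract does not specify a loss function or a smoothing method; this example assumes unigram language models with additive smoothing and uses the KL divergence between the query model and each document model as the loss, so that ranking by ascending risk returns the best-matching documents first. The function names (`language_model`, `kl_loss`, `rank_by_risk`) and the parameter `alpha` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def language_model(text, vocab, alpha=0.1):
    """Estimate a unigram language model over a fixed vocabulary.

    Additive smoothing (alpha) is an assumption of this sketch; it keeps
    every probability strictly positive so the KL divergence is finite.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_loss(query_model, doc_model):
    """Loss of presenting a document: KL(query model || document model)."""
    return sum(p * math.log(p / doc_model[w]) for w, p in query_model.items())


def rank_by_risk(query, docs, alpha=0.1):
    """Return (doc_id, risk) pairs sorted by ascending risk."""
    # Build one shared vocabulary from the query and all documents.
    vocab = set(query.lower().split())
    for text in docs.values():
        vocab.update(text.lower().split())
    vocab = sorted(vocab)

    q_model = language_model(query, vocab, alpha)
    risks = {
        doc_id: kl_loss(q_model, language_model(text, vocab, alpha))
        for doc_id, text in docs.items()
    }
    return sorted(risks.items(), key=lambda item: item[1])


if __name__ == "__main__":
    docs = {
        "d1": "language models for retrieval",
        "d2": "cooking pasta at home",
    }
    for doc_id, risk in rank_by_risk("retrieval language models", docs):
        print(doc_id, round(risk, 3))
```

Swapping in a different loss function changes the ranking behavior without touching the rest of the pipeline, which is the sense in which user preferences are encoded by the loss. A loss that depends on the documents already selected, rather than on each document in isolation, would be one way to drop the independent-relevance assumption for tasks such as subtopic retrieval.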

Keywords: Retrieval models, Risk minimization, Statistical language models, Bayesian decision theory

Review history: Accepted 12 November 2004; available online 6 January 2005.

Official paper URL: https://doi.org/10.1016/j.ipm.2004.11.003