Two-level document ranking using mutual information in natural language information retrieval

Authors:

Highlights:

Abstract

Information retrieval aims to retrieve information that satisfies a user's information need, which raises the problem of selecting only the information relevant to that user. Ranking techniques are used to find the documents in a collection that are most likely to be relevant to the user's query. However, we find that some retrieved documents have contexts that are not consistent with the query. Mutual information is a measure of the association between one word and another, so we use it to re-evaluate the relation between the terms in a retrieved document and the terms in the query. In this paper, we discuss a model of a natural language information retrieval system based on a two-level document ranking method using mutual information. At the first level, we retrieve documents based on automatically constructed index terms. At the second level, we reorder the retrieved documents using mutual information. We show that our method achieves a considerable improvement in retrieval effectiveness over a traditional linear searching method. We also analyse seven newly developed formulas for reordering the retrieved documents and recommend the one that dominates the others in terms of retrieval effectiveness.
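The two-level idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the seven reordering formulas, so the second-level score below is an assumption: the average pointwise mutual information, log2(P(x, y) / (P(x) P(y))), between query terms and document terms, estimated from document-level co-occurrence. The function names (`build_counts`, `pmi`, `first_level`, `second_level`) and the toy collection are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(docs):
    """Count, per word and per unordered word pair, the documents containing them."""
    word_df = Counter()
    pair_df = Counter()
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs are stored in sorted order
    return word_df, pair_df


def pmi(x, y, word_df, pair_df, n_docs):
    """Pointwise mutual information of x and y from document co-occurrence."""
    joint = word_df[x] if x == y else pair_df[tuple(sorted((x, y)))]
    if joint == 0:
        return 0.0  # assumption: pairs never seen together count as unrelated
    return math.log2(joint * n_docs / (word_df[x] * word_df[y]))


def first_level(query, docs):
    """First level (simplified): keep documents sharing at least one query term."""
    q = set(query)
    return [i for i, tokens in enumerate(docs) if q & set(tokens)]


def second_level(query, docs, candidates, word_df, pair_df):
    """Second level (assumed formula): reorder by average query-term/document-term PMI."""
    n_docs = len(docs)
    q_terms = set(query)

    def score(i):
        d_terms = set(docs[i])
        total = sum(pmi(q, t, word_df, pair_df, n_docs)
                    for q in q_terms for t in d_terms)
        return total / (len(q_terms) * len(d_terms))

    return sorted(candidates, key=score, reverse=True)


# Toy collection: "bank" appears in both a river context and a finance context.
docs = [
    ["bank", "river", "water"],
    ["bank", "money", "loan"],
    ["money", "loan", "interest"],
    ["river", "water", "fish"],
]
query = ["bank", "loan"]

word_df, pair_df = build_counts(docs)
candidates = first_level(query, docs)
ranking = second_level(query, docs, candidates, word_df, pair_df)
print(ranking)
```

In this toy run the river-context document is retrieved at the first level because it contains "bank", but the second level moves it to the end of the list because its other terms have little association with the query terms. A real system would estimate the co-occurrence statistics from a much larger corpus and would use the paper's own index terms and formulas.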

Keywords:

Article history: Received 28 May 1996, Accepted 3 December 1996, Available online 11 June 1998.

Official URL: https://doi.org/10.1016/S0306-4573(96)00074-X