Document reranking by term distribution and maximal marginal relevance for Chinese information retrieval

Abstract

In this paper, we propose a document reranking method for Chinese information retrieval. The method is based on a term weighting scheme that integrates the local and global distribution of terms with document frequency, document position, and term length. The weighting scheme allows a larger portion of the retrieved documents to be set arbitrarily as relevance feedback, and eases the concern that very few relevant documents appear among the top retrieved documents. It also helps to improve the performance of maximal marginal relevance (MMR) in document reranking. The method was evaluated by mean average precision (MAP), a recall-oriented measure. Significance tests showed that our method achieves significant improvement over standard baselines and consistently outperforms related methods.
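For readers unfamiliar with MMR, the following is a minimal sketch of the standard greedy MMR selection rule, not the authors' implementation. The abstract does not specify how similarities are computed or how the term weights feed into MMR, so the function name `mmr_rerank`, the trade-off parameter `lam`, and the precomputed similarity inputs are all assumptions made for illustration.

```python
def mmr_rerank(query_sim, doc_sim, lam=0.7, k=None):
    """Greedy maximal marginal relevance reranking (illustrative sketch).

    query_sim: list where query_sim[i] is the similarity of document i to the query.
    doc_sim:   square matrix where doc_sim[i][j] is the similarity of documents i and j.
    lam:       hypothetical trade-off weight; 1.0 means pure relevance ranking.
    k:         number of documents to return (defaults to all).
    Returns the document indices in reranked order.
    """
    n = len(query_sim)
    k = n if k is None else min(k, n)
    selected = []
    remaining = list(range(n))

    while remaining and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any already-selected document.
            redundancy = max((doc_sim[i][j] for j in selected), default=0.0)
            return lam * query_sim[i] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy data (made up): documents 0 and 1 are near-duplicates, document 2 is distinct.
query_sim = [0.9, 0.85, 0.3]
doc_sim = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse_order = mmr_rerank(query_sim, doc_sim, lam=0.5)
relevance_order = mmr_rerank(query_sim, doc_sim, lam=1.0)
```

With `lam=0.5` the near-duplicate document is pushed below the distinct one, while `lam=1.0` reduces to ranking by query similarity alone. In the setting the abstract describes, the similarity scores would presumably be derived from the proposed term weights, but that link is an assumption here.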

Keywords: Relevance feedback, Term extraction, Term weighting, Maximal marginal relevance, Chinese information retrieval

Review history: Received 25 May 2006, Accepted 25 July 2006, Available online 19 October 2006.

Official paper URL: https://doi.org/10.1016/j.ipm.2006.07.011