Mining and modeling linkage information from citation context for improving biomedical literature retrieval

Authors:

Highlights:

Abstract

Mining linkage information from the citation graph has been shown to be effective in identifying important literature. However, the question of how to use this linkage information to facilitate literature retrieval remains largely unanswered. In this paper, in the context of biomedical literature retrieval, we first conduct a case study to determine whether applying the PageRank and HITS algorithms directly to the citation graph is the best way to use citation linkage information to improve retrieval. Second, we propose a probabilistic combination framework for integrating citation information into a content-based information retrieval weighting model. Based on the observations from the case study, we present two strategies for modeling the linkage information contained in the citation graph. The proposed framework provides theoretical support for combining content and linkage information, and under this framework exhaustive parameter tuning can be avoided. Extensive experiments on three TREC Genomics collections demonstrate the advantages and effectiveness of the proposed methods.
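To make concrete what "applying PageRank directly to the citation graph" could look like, the following is a minimal sketch in Python. It is not the authors' implementation: the function name `pagerank`, the damping value of 0.85, the iteration count, and the four-paper toy graph are all illustrative assumptions, and the abstract does not specify how the paper's own framework combines such scores with content-based weights.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Power-iteration PageRank over a citation graph (illustrative sketch).

    `citations` maps each paper ID to the list of paper IDs it cites.
    The damping factor and iteration count are assumed defaults,
    not values taken from the paper.
    """
    # Collect every paper that appears as a citing or cited node.
    papers = set(citations)
    for cited_list in citations.values():
        papers.update(cited_list)
    papers = sorted(papers)
    n = len(papers)

    # Start from a uniform score distribution.
    scores = {p: 1.0 / n for p in papers}

    for _ in range(iterations):
        # Papers that cite nothing spread their score uniformly over all papers.
        dangling = sum(scores[p] for p in papers if not citations.get(p))
        new_scores = {
            p: (1.0 - damping) / n + damping * dangling / n for p in papers
        }
        # Each citing paper splits its score evenly among the papers it cites.
        for citing, cited_list in citations.items():
            if not cited_list:
                continue
            share = damping * scores[citing] / len(cited_list)
            for cited in cited_list:
                new_scores[cited] += share
        scores = new_scores

    return scores


# Hypothetical toy graph: A and B both cite C, and C cites D.
toy_graph = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
scores = pagerank(toy_graph)
for paper in sorted(scores, key=scores.get, reverse=True):
    print(paper, round(scores[paper], 4))
```

In this toy graph the cited papers outrank the papers that only cite others, which is the kind of query-independent importance signal the case study examines before asking whether it is the best way to support retrieval.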

Keywords: Probabilistic model, Citation analysis, Ranking, Biomedical information retrieval

Review history: Received 20 September 2009, Revised 12 February 2010, Accepted 24 March 2010, Available online 11 May 2010.

Official paper URL: https://doi.org/10.1016/j.ipm.2010.03.010