On document relevance and lexical cohesion between query terms

Authors:

Abstract

Lexical cohesion is a property of text, achieved through lexical-semantic relations between the words in it. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms' occurrences in a document is related to the document's relevance to the query. Lexical cohesion between distinct query terms in a document is estimated on the basis of the lexical-semantic relations (repetition, synonymy, hyponymy and sibling) that exist between their collocates, that is, the words that co-occur with them in the same windows of text. Experiments suggest that lexical cohesion differs significantly between relevant and non-relevant document sets. A document ranking method based on lexical cohesion shows some performance improvements.
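
The idea of estimating cohesion from collocates can be illustrated with a minimal Python sketch. It is not the paper's method: it counts only the repetition relation (identical collocates shared by two query terms) and leaves out synonymy, hyponymy and sibling links, which would need a lexical resource. The function names, the default window size of 3 and the Dice-style normalization are all assumptions made for illustration.

```python
from collections import Counter


def collocates(tokens, term, window=3, exclude=()):
    """Count words found within `window` positions of each occurrence of `term`.

    Hypothetical helper: the window size and the exclusion set are
    illustrative assumptions, not values taken from the paper.
    """
    found = Counter()
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] not in exclude:
                found[tokens[j]] += 1
    return found


def cohesion_score(tokens, term_a, term_b, window=3):
    """Repetition-only cohesion between two query terms in one document.

    Links are collocates shared by both terms. The score is normalized
    Dice-style to the range [0, 1]; this normalization is an assumption.
    """
    query_terms = {term_a, term_b}
    a = collocates(tokens, term_a, window, exclude=query_terms)
    b = collocates(tokens, term_b, window, exclude=query_terms)
    links = sum(min(a[w], b[w]) for w in a.keys() & b.keys())
    total = sum(a.values()) + sum(b.values())
    return 2 * links / total if total else 0.0


if __name__ == "__main__":
    doc = "the river bank flooded when the river water rose near the bank".split()
    print(cohesion_score(doc, "river", "bank", window=2))
```

A ranking method in this spirit could combine such a score with a conventional retrieval score, but how the paper combines them is not stated in the abstract.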

Keywords: Information retrieval, Lexical cohesion, Word collocation, Document relevance

Article history: Received 20 October 2005, Revised 10 January 2006, Accepted 13 January 2006, Available online 15 March 2006.

Official paper URL: https://doi.org/10.1016/j.ipm.2006.01.008