A better fitness measure of a text-document for a given set of keywords

Authors:

Abstract

We present a new fitness measure B_W(D) of a text document D against a set of keywords W. Fitness evaluation is a basic operation in information retrieval. The measure B_W(D) differs from other measures in that it accounts for both the frequency of the keywords and their clustering characteristics. It also satisfies the important properties of monotonicity and super-additivity, which do not hold for either the well-known Paice measure or the mixed max–min measure. We give efficient algorithms for computing B_W(D) and a generalized form B_W^α(D).

Keywords: Information retrieval, Fitness measure, Clustering, Algorithm

Review history: Received 7 October 1998, Accepted 18 March 1999, Available online 7 June 2001.

Official paper URL: https://doi.org/10.1016/S0031-3203(99)00087-4