On the efficiency of best-match cluster searches

Authors:

Highlights:

Abstract

The efficiency of various cluster-based retrieval (CBR) strategies is analyzed, and the possibility of combining CBR with inverted index search (IIS) is investigated. A method for combining the two approaches is proposed and shown to be cost effective in terms of paging and CPU time. In the new method, documents are selected from the best-matching clusters by using an inverted index built over all documents. Although this runs counter to the concept of best-match CBR, the experiments indicate that it is much more efficient than conventional approaches. The experiments consider the effects of the number of selected clusters, page size, centroid length, and matching function. They also show that the storage overhead of the new method would be moderately higher than that of IIS.
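The combined strategy described in the abstract can be illustrated with a minimal sketch. It assumes a toy in-memory collection, a simple term-overlap score for centroids, and a summed term-frequency score for documents; none of these are taken from the paper, whose matching functions and storage layout are not given here. All names (`build_inverted_index`, `rank_clusters`, `search`, `doc_cluster`) are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency} over ALL documents."""
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def rank_clusters(query, centroids, n_clusters):
    """Score centroids by shared query terms (assumed matching function)."""
    scores = {cid: len(set(query) & set(terms)) for cid, terms in centroids.items()}
    ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ranked[:n_clusters]


def search(query, index, centroids, doc_cluster, n_clusters, top_k):
    """Pick the best-matching clusters, then score their documents via the index."""
    selected = set(rank_clusters(query, centroids, n_clusters))
    doc_scores = defaultdict(int)
    for term in set(query):
        # Walk the posting list for the whole collection, keeping only
        # documents that belong to one of the selected clusters.
        for doc_id, tf in index.get(term, {}).items():
            if doc_cluster[doc_id] in selected:
                doc_scores[doc_id] += tf
    ranked = sorted(doc_scores, key=lambda d: (-doc_scores[d], d))
    return ranked[:top_k]


# Toy data, invented for illustration only.
docs = {
    "d1": ["cluster", "search", "index"],
    "d2": ["cluster", "retrieval"],
    "d3": ["paging", "cpu", "index"],
    "d4": ["index", "search", "search"],
}
doc_cluster = {"d1": "c1", "d2": "c1", "d3": "c2", "d4": "c2"}
centroids = {
    "c1": ["cluster", "search", "retrieval"],
    "c2": ["paging", "cpu", "index"],
}
index = build_inverted_index(docs)
print(search(["cluster", "search"], index, centroids, doc_cluster, n_clusters=1, top_k=2))
```

In this sketch the cluster step only narrows the candidate set; document scoring never touches per-cluster document lists, which is the point of routing selection through the collection-wide index.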

Keywords:

Article history: Received 20 July 1993, Accepted 7 July 1994, Available online 19 July 2002.

Paper URL: https://doi.org/10.1016/0306-4573(94)90049-3