Applying informetric characteristics of databases to IR system file design, Part II: Simulation comparisons

Abstract

Using informetric models of IR system database contents and use, a simulation study was undertaken to examine hypothetical system performance. Combinations of informetric characteristics, including term distribution and term selection patterns, were used to examine how different informetric environments affect retrieval performance and space requirements for a given set of file structures. Of the six structures examined, several proved better suited to particular environments. In most cases, the chained hashing structure provided the best retrieval performance and the most economical space requirements where gradually decreasing Zipfian term distributions were present, regardless of the term selection distribution. However, the proposed modified hashing structure performed better where steep Zipfian term distributions existed. Two variations of the BIM tree (Balanced Implicit Multiway tree) also performed well where steep term distributions were encountered and where the term selection relationship favored the retrieval of more frequently occurring terms.
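As a rough illustration of how a term selection pattern can interact with a chained hashing layout, the following minimal sketch estimates the average number of chain probes per lookup under Zipfian selection probabilities of different steepness. It is not the simulation used in the study and does not reproduce its results. The function names, the modulus hash, the insertion of terms in rank order, and the sizes (1000 terms, 250 buckets) are all assumptions made for illustration only.

```python
def zipf_probabilities(n_terms, exponent):
    """Normalized Zipfian probabilities for ranks 1..n_terms.

    A larger exponent gives a steeper distribution (assumed parameterization).
    """
    raw = [1.0 / (rank ** exponent) for rank in range(1, n_terms + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def chain_positions(n_terms, n_buckets):
    """Place term ids 0..n_terms-1 into chained buckets, in rank order.

    Returns, for each term, its 1-based position in its chain, which is the
    number of probes needed to find it. The hash is a plain modulus
    (hypothetical choice, picked so the layout is deterministic).
    """
    chain_length = [0] * n_buckets
    positions = []
    for term_id in range(n_terms):
        bucket = term_id % n_buckets
        chain_length[bucket] += 1
        positions.append(chain_length[bucket])
    return positions


def expected_probes(selection_probs, positions):
    """Average probes per lookup, weighted by how often each term is selected."""
    return sum(p * pos for p, pos in zip(selection_probs, positions))


if __name__ == "__main__":
    n_terms, n_buckets = 1000, 250  # illustrative sizes, not from the paper
    positions = chain_positions(n_terms, n_buckets)
    for label, exponent in [("uniform", 0.0), ("gradual", 0.5), ("steep", 2.0)]:
        probs = zipf_probabilities(n_terms, exponent)
        print(label, round(expected_probes(probs, positions), 3))
```

Because the sketch inserts high-ranked terms first, they sit at the front of their chains, so steeper selection distributions yield fewer probes on average. A different insertion order or hash function would change these numbers.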

Review history: Available online 17 July 2002.

Official paper URL: https://doi.org/10.1016/0306-4573(92)90099-L