The use of binary search trees in external distribution sorting

作者:

Highlights:

摘要

A new method of external distribution sorting called tree partitioning is suggested. It involves the use of a binary tree to split an incoming file into successively smaller partitions until these are small enough to be sorted internally. The tree is generated from a sample of data of the same type as those to be sorted using that tree. The method is therefore suitable in cases where some regularity in the data is to be expected, as is the case, for example, with words from a textual or bibliographic database. The number of disc accesses during a tree partitioning sort were calculated in a simulation using files of up to 900,000 text words extracted from British National Bibliography catalogue files. These were compared with the actual numbers of disc accesses used by a standard sort-merge utility. Although the differences were in general not large, the comparison was favourable to tree partitioning in the majority of cases.

论文关键词:

论文评审过程:Received 28 June 1983, Available online 17 July 2002.

论文官网地址:https://doi.org/10.1016/0306-4573(84)90006-2