An efficient algorithm for mining frequent weighted itemsets using interval word segments

Authors: Ham Nguyen, Bay Vo, Minh Nguyen, Witold Pedrycz

Abstract

Mining frequent weighted itemsets (FWIs) from weighted-item transaction databases has recently attracted research interest. In real-world applications, sparse weighted-item transaction databases (SWITDs) are common. For example, a supermarket stocks many items, but each transaction contains only a small number of them. In this paper, we propose an interval word segment (IWS) structure for storing and processing tidsets, to make mining FWIs from SWITDs more efficient. The IWS structure allows the tidsets of two itemsets to be intersected quickly. A map array is proposed for storing a 1-bit index for words. From the map array, 1-bits are mapped to create the tidset of an itemset, which speeds up the calculation of the itemset's weighted support. Experimental results on a number of SWITDs show that the method based on the IWS structure outperforms existing methods.
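
The abstract does not give the exact IWS layout, so the following is only a minimal sketch of the general idea under stated assumptions. A tidset is kept as a bit vector cut into fixed-width words, only the interval of words between the first and last non-zero word is stored, and a lookup table stands in for the map array by listing the 1-bit positions of every possible word value. The names `Segment`, `BIT_MAP`, `build_segment`, `intersect`, `tids_of` and `weighted_support`, the 8-bit word width, and the 0-based transaction ids are all hypothetical choices for illustration. The weighted-support definition used here (sum of the transaction weights of the covering transactions divided by the sum of all transaction weights) is the one common in the FWI literature and is assumed rather than taken from this abstract.

```python
from dataclasses import dataclass
from typing import List

WORD_BITS = 8  # assumed word width, kept small so the lookup table stays tiny

# Assumed form of the "map array": for each possible word value,
# the positions of its 1-bits (least significant bit is position 0).
BIT_MAP: List[List[int]] = [
    [b for b in range(WORD_BITS) if (value >> b) & 1]
    for value in range(1 << WORD_BITS)
]


@dataclass
class Segment:
    start: int        # index of the first stored word
    words: List[int]  # consecutive words from `start`; empty for an empty tidset


def build_segment(tids: List[int]) -> Segment:
    """Pack 0-based transaction ids into the interval of words that covers them."""
    if not tids:
        return Segment(0, [])
    first = min(tids) // WORD_BITS
    last = max(tids) // WORD_BITS
    words = [0] * (last - first + 1)
    for tid in tids:
        words[tid // WORD_BITS - first] |= 1 << (tid % WORD_BITS)
    return Segment(first, words)


def intersect(a: Segment, b: Segment) -> Segment:
    """AND only the words where the two intervals overlap, then trim zero ends."""
    lo = max(a.start, b.start)
    hi = min(a.start + len(a.words), b.start + len(b.words))  # exclusive bound
    words = [a.words[i - a.start] & b.words[i - b.start] for i in range(lo, hi)]
    while words and words[0] == 0:
        words.pop(0)
        lo += 1
    while words and words[-1] == 0:
        words.pop()
    return Segment(lo if words else 0, words)


def tids_of(seg: Segment) -> List[int]:
    """Recover transaction ids by looking each word up in BIT_MAP."""
    return [
        (seg.start + offset) * WORD_BITS + bit
        for offset, word in enumerate(seg.words)
        for bit in BIT_MAP[word]
    ]


def weighted_support(seg: Segment, tw: List[float]) -> float:
    """Assumed definition: covered transaction weight over total transaction weight."""
    return sum(tw[t] for t in tids_of(seg)) / sum(tw)


if __name__ == "__main__":
    x = build_segment([1, 9, 20])
    y = build_segment([9, 20, 33])
    xy = intersect(x, y)
    print(tids_of(xy))
    print(weighted_support(xy, [1.0] * 40))
```

In this sketch the cost of an intersection depends on how many words the two intervals share, not on the number of transactions in the database, which is the property that makes an interval representation attractive for sparse data. Two tidsets whose intervals do not overlap intersect to the empty segment without any word being read.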

Keywords: Dynamic bit vector, Frequent weighted itemset, Interval word segments, Multibit segments

Official paper URL: https://doi.org/10.1007/s10489-016-0799-6