d-k-min-wise independent family of hash functions
作者:
Highlights:
• A framework to exponentially improve space and time of min-wise based algorithms.
• Gets around the Ω(log1/ϵ) lower bound needed by min-wise hash functions.
• Only a constant degree of independence is required for the space and time improvements.
• Demonstrates how to apply the framework for similarity and rarity over data stream.
摘要
•A framework to exponentially improve space and time of min-wise based algorithms.•Gets around the Ω(log1/ϵ) lower bound needed by min-wise hash functions.•Only a constant degree of independence is required for the space and time improvements.•Demonstrates how to apply the framework for similarity and rarity over data stream.
论文关键词:min-wise hash functions,Data streams,Windowed data streams,Similarity,Rarity
论文评审过程:Received 19 October 2015, Revised 15 September 2016, Accepted 23 September 2016, Available online 11 October 2016, Version of Record 14 November 2016.
论文官网地址:https://doi.org/10.1016/j.jcss.2016.09.005