BIRCHSCAN: A sampling method for applying DBSCAN to large datasets
Authors:
Highlights:
• A sampling-based method for running DBSCAN on large datasets.
• The BIRCH algorithm is used to build a biased sample.
• It is driven by a single parameter: a multiplier factor that defines the Threshold used by BIRCH.
• The proposed method offers a good trade-off between result quality and running time on larger datasets.
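
The highlights above can be illustrated with a rough sketch. This is not the authors' implementation: it uses scikit-learn's `Birch` and `DBSCAN`, and the function name `birch_sample_dbscan` is hypothetical. Three details are assumptions made only for illustration, since the highlights do not specify them: the BIRCH threshold is taken as `multiplier * eps`, the sample consists of the BIRCH subcluster centers, and each center is weighted by the number of original points it represents.

```python
import numpy as np
from sklearn.cluster import Birch, DBSCAN


def birch_sample_dbscan(X, eps, min_samples, multiplier):
    """Cluster X by running DBSCAN on a BIRCH-derived sample (illustrative sketch)."""
    # Assumption: the BIRCH threshold is the multiplier times DBSCAN's eps.
    threshold = multiplier * eps

    # n_clusters=None skips BIRCH's global clustering step, so the result is
    # just the set of subclusters; their centers act as the biased sample.
    birch = Birch(threshold=threshold, n_clusters=None).fit(X)
    centers = birch.subcluster_centers_

    # Assumption: weight each center by how many original points map to it,
    # so that min_samples still refers to counts of original points.
    weights = np.bincount(birch.labels_, minlength=len(centers))

    # Run DBSCAN on the (much smaller) set of weighted representatives.
    center_labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(
        centers, sample_weight=weights
    )

    # Each original point inherits the label of its nearest subcluster center.
    return center_labels[birch.labels_], centers
```

A smaller multiplier yields more subclusters, so the sample follows the data more closely at a higher running cost. A larger multiplier shrinks the sample and speeds up DBSCAN, at the risk of merging fine structure.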
Keywords: Clustering, Sampling, DBSCAN, BIRCH
Review history: Received 9 March 2021, Revised 29 June 2021, Accepted 29 June 2021, Available online 10 July 2021, Version of Record 14 July 2021.
Official URL: https://doi.org/10.1016/j.eswa.2021.115518