Distributed evidential clustering toward time series with big data issue

Authors:

Highlights:

• A distributed evidential clustering algorithm parallelized by Spark is proposed.

• DBPEC analyzes millions of time series without destroying the raw data structure.

• DBPEC generates practical results for big data by incorporating a fast version of DTW.

• Ambiguity and uncertainty in memberships are better described in DBPEC.

• Credal partitions help users obtain reasonable explanations for real-world problems.

Keywords: Big data, Distributed calculation, Apache Spark, Evidential clustering, Warped dissimilarity measure

Review history: Received 28 March 2021, Revised 20 August 2021, Accepted 21 November 2021, Available online 11 December 2021, Version of Record 15 December 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.116279