Correlation analysis techniques for uncertain time series
Authors: Mahsa Orang, Nematollaah Shiri
Abstract
Many applications, such as location-based services and wireless sensor networks, generate and process uncertain time series (UTS), in which the "exact" value at each timestamp is unknown. Traditional correlation analysis and search techniques developed for standard time series are inadequate for the UTS data analysis these applications require. Motivated by this need, we propose suitable concepts and techniques for UTS correlation analysis. We formalize the notions of normalization and correlation for UTS in two general settings, based on the information available at each timestamp: (1) PDF-based UTS (a probability density function is available) and (2) multiset-based UTS (a multiset of observed values is available). For each setting, we formulate correlation as a random variable and develop techniques to determine its underlying probability density function. For setting (2), we also present probabilistic pruning and sampling techniques to speed up the search process. We conducted numerous experiments to evaluate the performance of the proposed techniques under different configurations using the UCR benchmark datasets. Our results indicate the effectiveness of the proposed techniques. For setting (2), in particular, our results show significant improvement in space utilization and computation time. We believe the proposed ideas and solutions can serve as a basis for powerful tools for UTS analysis and search tasks.
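To make the idea of "correlation as a random variable" concrete for the multiset-based setting, the following is a minimal sketch, not the authors' algorithm. It assumes each timestamp holds a list of observed values, draws one value per timestamp uniformly at random to form an ordinary time series, and computes the Pearson correlation of each draw. The function names (`pearson`, `sample_correlations`, `prob_correlation_at_least`), the uniform sampling scheme, and the toy data are all assumptions made for illustration; the abstract does not describe the paper's own pruning and sampling procedures in detail.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
    # A constant series has no defined correlation; return 0.0 by convention here.
    return numerator / denominator if denominator else 0.0


def sample_correlations(uts_x, uts_y, n_samples=1000, seed=0):
    """Empirical distribution of the correlation between two multiset-based UTS.

    uts_x and uts_y are lists of equal length; element t is the list of
    values observed at timestamp t (hypothetical representation).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Pick one observed value per timestamp to get one possible "certain" series.
        x = [rng.choice(observations) for observations in uts_x]
        y = [rng.choice(observations) for observations in uts_y]
        samples.append(pearson(x, y))
    return samples


def prob_correlation_at_least(samples, threshold):
    """Estimated probability that the correlation is >= threshold."""
    return sum(s >= threshold for s in samples) / len(samples)


# Toy data (made up): two uncertain series with two observations per timestamp.
uts_a = [[1.0, 1.2], [2.0, 2.1], [3.1, 2.9], [4.0, 4.2]]
uts_b = [[2.1, 1.9], [4.0, 4.3], [6.2, 5.8], [8.1, 7.9]]
corr_samples = sample_correlations(uts_a, uts_b, n_samples=1000, seed=0)
print(prob_correlation_at_least(corr_samples, 0.9))
```

A probabilistic query such as "is the correlation at least 0.9 with probability at least 0.8?" can then be answered approximately from the sampled values, at the cost of sampling error that shrinks as `n_samples` grows.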
Keywords: Correlation analysis, Probabilistic queries, Query optimization, Query processing, Uncertain data
Paper URL: https://doi.org/10.1007/s10115-016-0939-7