Data mining of vector–item patterns using neighborhood histograms
Authors: Anne M. Denton, Jianfei Wu
Abstract
The representation of multiple continuous attributes as dimensions in a vector space has been among the most influential concepts in machine learning and data mining. We consider sets of related continuous attributes as vector data and search for patterns that relate a vector attribute to one or more items. The presence of an item set defines a subset of vectors that may or may not show unexpected density fluctuations. We test for fluctuations by studying density histograms. A vector–item pattern is considered significant if its density histogram significantly differs from what is expected for a random subset of transactions. Using two different density measures, we evaluate the algorithm on two real data sets and one that was artificially constructed from time series data.
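The test described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm; it is one possible reading, under several assumptions that do not come from the paper: the density measure is the number of other subset members within a fixed Euclidean radius, the expected histogram is estimated by averaging over random subsets of the same size, and significance is a Monte Carlo p-value on a smoothed chi-square-style distance. All function and parameter names (`neighbor_counts`, `density_histogram`, `pattern_p_value`, `radius`, `n_bins`, `n_random`) are hypothetical.

```python
import math
import random


def neighbor_counts(vectors, subset, radius):
    """Assumed density measure: for each subset member, the number of
    other subset members within `radius` (Euclidean distance)."""
    return [
        sum(1 for j in subset
            if j != i and math.dist(vectors[i], vectors[j]) <= radius)
        for i in subset
    ]


def density_histogram(counts, n_bins):
    """Histogram of neighbor counts; the last bin collects counts >= n_bins - 1."""
    hist = [0] * n_bins
    for c in counts:
        hist[min(c, n_bins - 1)] += 1
    return hist


def chi_square_like(hist, expected):
    """Smoothed chi-square-style distance (the +1.0 avoids division by zero)."""
    return sum((h - e) ** 2 / (e + 1.0) for h, e in zip(hist, expected))


def pattern_p_value(vectors, item_subset, radius,
                    n_bins=10, n_random=200, seed=0):
    """Monte Carlo p-value: how often does a random subset of the same size
    deviate from the expected histogram at least as much as the item subset?"""
    rng = random.Random(seed)
    all_indices = range(len(vectors))
    size = len(item_subset)

    # Histograms for random subsets of transactions (the null model assumed here).
    random_hists = [
        density_histogram(
            neighbor_counts(vectors, rng.sample(all_indices, size), radius),
            n_bins,
        )
        for _ in range(n_random)
    ]
    # Expected histogram: bin-wise mean over the random subsets.
    expected = [sum(col) / n_random for col in zip(*random_hists)]

    observed = chi_square_like(
        density_histogram(neighbor_counts(vectors, item_subset, radius), n_bins),
        expected,
    )
    as_extreme = sum(
        1 for h in random_hists if chi_square_like(h, expected) >= observed
    )
    return (1 + as_extreme) / (1 + n_random)
```

Here `vectors` is a list of equal-length numeric tuples and `item_subset` is the list of indices of the transactions that contain the item set. A small p-value suggests that the vectors selected by the item set are distributed differently in density from a random subset. The pairwise distance loop is quadratic in the subset size, so this sketch is only suitable for small data.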
Keywords: Pattern mining, Pattern significance, Significance of classification, Vector space representation, Gene expression analysis, Time series subsequences
Paper URL: https://doi.org/10.1007/s10115-009-0201-7