Speeding up the Self-Organizing Feature Map Using Dynamic Subset Selection

Authors: Leigh Wetmore, Malcolm I. Heywood, A. Nur Zincir-Heywood

Abstract

An active learning algorithm is devised for training Self-Organizing Feature Maps on large data sets. Active learning algorithms recognize that not all exemplars are created equal. Thus, the concepts of exemplar age and difficulty are used to filter the original data set such that training epochs are conducted over only a small subset of it. The ensuing Hierarchical Dynamic Subset Selection algorithm introduces definitions of exemplar difficulty suited to an unsupervised learning context and, therefore, appropriate Self-Organizing Map (SOM) stopping criteria. The algorithm is benchmarked on several real-world data sets with training set exemplar counts in the region of 30,000 to 500,000. Cluster accuracy is demonstrated to be at least as good as that of the original SOM algorithm while requiring a fraction of the computational overhead.
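
The age-and-difficulty filtering idea can be illustrated with a minimal sketch. It is not the authors' implementation. It covers a single selection level only, not the hierarchical scheme. It assumes that difficulty is an exemplar's last quantization error against its best-matching unit, that age counts the epochs since an exemplar was last selected, and that the two are mixed by a hypothetical parameter `p_difficulty`. The function names `select_subset` and `update_bookkeeping` are likewise illustrative.

```python
import random


def select_subset(difficulty, age, subset_size, p_difficulty=0.7, rng=None):
    """Pick distinct exemplar indices, biased toward difficult and old exemplars.

    Assumption: the selection weight is a convex mix of normalized
    difficulty and normalized age; the paper's definitions may differ.
    """
    rng = rng or random.Random()
    total_d = sum(difficulty) or 1.0
    total_a = sum(age) or 1.0
    # A small epsilon keeps every weight positive so sampling never fails.
    pool = {
        i: p_difficulty * d / total_d + (1.0 - p_difficulty) * a / total_a + 1e-12
        for i, (d, a) in enumerate(zip(difficulty, age))
    }
    chosen = []
    for _ in range(min(subset_size, len(pool))):
        indices = list(pool)
        pick = rng.choices(indices, weights=[pool[i] for i in indices])[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen


def update_bookkeeping(difficulty, age, chosen, errors):
    """Reset the age of selected exemplars, age the rest, refresh difficulty.

    errors[k] is the assumed difficulty measure (for example, quantization
    error) observed for exemplar chosen[k] during the training epoch.
    """
    chosen_set = set(chosen)
    for i in range(len(age)):
        age[i] = 0 if i in chosen_set else age[i] + 1
    for i, err in zip(chosen, errors):
        difficulty[i] = err
```

In use, each training epoch would call `select_subset`, train the map on the returned exemplars only, and then call `update_bookkeeping` with the errors observed during that epoch, so that rarely visited or poorly represented exemplars become more likely to be picked next.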

Keywords: active learning, data mining, self-organizing feature map

Paper URL: https://doi.org/10.1007/s11063-004-7775-6