A Scalable Parallel Algorithm for Self-Organizing Maps with Applications to Sparse Data Mining Problems

Authors: R.D. Lawrence, G.S. Almasi, H.E. Rushmeier

Abstract

We describe a scalable parallel implementation of the self-organizing map (SOM) suitable for data-mining applications involving clustering or segmentation of large data sets such as those encountered in the analysis of customer spending patterns. The parallel algorithm is based on the batch SOM formulation, in which the neural weights are updated at the end of each pass over the training data. The underlying serial algorithm is enhanced to take advantage of the sparseness often encountered in these data sets. Analysis of a realistic test problem shows that the batch SOM algorithm captures key features observed using the conventional on-line algorithm, with comparable convergence rates.
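The batch formulation described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`bmu`, `accumulate`, `batch_epoch`), the Gaussian neighborhood function, the dictionary representation of sparse records, and the norm-expansion shortcut for the distance computation are all assumptions chosen for illustration. The sketch shows why the batch form parallelizes naturally: each data partition produces partial sums that depend only on the weights from the previous pass, so the partial sums can be computed independently and then added together before the single end-of-pass update.

```python
import math

def bmu(weights, sq_norms, record):
    """Best-matching unit for a sparse record {feature_index: value}."""
    best, best_score = 0, float("inf")
    for k, w in enumerate(weights):
        # ||w - x||^2 = ||w||^2 - 2 w.x + ||x||^2; the last term is the same
        # for every node, so only the nonzero features of x are visited.
        score = sq_norms[k] - 2.0 * sum(w[j] * v for j, v in record.items())
        if score < best_score:
            best, best_score = k, score
    return best

def accumulate(weights, coords, records, sigma):
    """Partial numerators and denominators for one data partition."""
    dim = len(weights[0])
    num = [[0.0] * dim for _ in weights]
    den = [0.0] * len(weights)
    sq_norms = [sum(x * x for x in w) for w in weights]
    for rec in records:
        c = bmu(weights, sq_norms, rec)
        for k, (r, s) in enumerate(coords):
            d2 = (r - coords[c][0]) ** 2 + (s - coords[c][1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))  # assumed Gaussian neighborhood
            den[k] += h
            for j, v in rec.items():  # sparse update: nonzero features only
                num[k][j] += h * v
    return num, den

def batch_epoch(weights, coords, partitions, sigma):
    """One pass: sum partial results over partitions, then update all weights."""
    dim = len(weights[0])
    num = [[0.0] * dim for _ in weights]
    den = [0.0] * len(weights)
    for part in partitions:  # in a parallel run, one partition per worker
        p_num, p_den = accumulate(weights, coords, part, sigma)
        for k in range(len(weights)):
            den[k] += p_den[k]
            for j in range(dim):
                num[k][j] += p_num[k][j]
    # Weights change only here, at the end of the pass.
    return [[n / den[k] for n in num[k]] if den[k] > 0 else list(weights[k])
            for k in range(len(weights))]

# Hypothetical toy data: a 2x2 map over three features, four sparse records.
coords = [(r, c) for r in range(2) for c in range(2)]
weights = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1], [0.05, 0.05, 0.05]]
data = [{0: 1.0}, {1: 2.0}, {2: 1.5}, {0: 0.5, 2: 0.5}]
single = batch_epoch(weights, coords, [data], sigma=1.0)
split = batch_epoch(weights, coords, [data[:2], data[2:]], sigma=1.0)
```

In this sketch, `single` and `split` agree up to floating-point rounding, because splitting the records across partitions changes only the order in which the partial sums are added. In a distributed setting the loop over partitions would be replaced by a global sum across workers.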

Keywords: parallel processing, parallel I/O, scalable data mining, clustering, Kohonen self-organizing maps, data visualization

Paper URL: https://doi.org/10.1023/A:1009817804059