Exploiting second-order dissimilarity representations for hierarchical clustering and visualization

Author: Helena Aidos

Abstract

The representation of objects is crucial for the learning process and often has a large impact on application performance. The dissimilarity space (DS) is one such representation, built by applying a dissimilarity measure between objects (e.g., Euclidean distance). However, other measures can be applied to generate more informative data representations. This paper focuses on the application of second-order dissimilarity measures, namely the Shared Nearest Neighbor (SNN) and the Dissimilarity Increments (Dinc), to produce new DSs that describe the data better by reducing the overlap of the classes and increasing the discriminative power of the features. Experimental results show that the proposed DSs provide significant benefits for unsupervised learning tasks. Compared with the Feature and Euclidean spaces, the proposed SNN and Dinc spaces improve the performance of traditional hierarchical clustering algorithms and also help in the visualization task, leading to higher values of the area under the precision/recall curve.
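
As a rough illustration of the idea described above, the following Python sketch builds a Euclidean dissimilarity space and a simple SNN-style dissimilarity space with NumPy. The abstract does not give the paper's exact SNN definition, so the formula used here (one minus the fraction of shared k nearest neighbors) is an assumption, as are the function names and the parameter `k`. The Dinc measure is not sketched because the abstract does not define it.

```python
import numpy as np


def euclidean_dissimilarity_space(X):
    """Represent each object by its Euclidean distances to all objects."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def snn_dissimilarity_space(X, k):
    """Assumed SNN dissimilarity: 1 - |kNN(i) & kNN(j)| / k.

    This is a common textbook form, not necessarily the paper's definition.
    """
    D = euclidean_dissimilarity_space(X)
    n = D.shape[0]

    # Exclude each object from its own neighbor list.
    D_no_self = D.copy()
    np.fill_diagonal(D_no_self, np.inf)
    nn = np.argsort(D_no_self, axis=1)[:, :k]

    # Boolean membership matrix: member[i, j] is True if j is a k-NN of i.
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], nn] = 1

    # Entry (i, j) counts the neighbors shared by objects i and j.
    shared = member @ member.T
    return 1.0 - shared / k


# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
E = euclidean_dissimilarity_space(X)
S = snn_dissimilarity_space(X, k=2)
```

Each row of `E` or `S` is the new feature vector of one object, so either matrix can be passed to a standard hierarchical clustering routine in place of the original features. On the toy data, objects from different groups share no neighbors and therefore sit at the maximum SNN dissimilarity.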

Keywords: Dissimilarity representation, Dissimilarity increments, Shared nearest neighbor, Geometrical complexity, Clustering, Visualization

Paper URL: https://doi.org/10.1007/s10618-022-00836-1