Pairwise ranking component analysis

Authors: Jean-François Pessiot, Hyeryung Kim, Wataru Fujibuchi

Abstract

Uncovering the latent structure of the data is an active research topic in data mining. However, in the distance metric learning framework, previous studies have mainly focused on the classification performance. In this work, we consider the distance metric learning problem in the ranking setting, where predicting the order between the data vectors is more important than predicting the class labels. We focus on two problems: improving the ranking prediction accuracy and identifying the latent structure of the data. The core of our model consists of ranking the data using a Mahalanobis distance function. The additional use of non-negativity constraints and an entropy-based cost function allows us to simultaneously minimize the ranking error while identifying useful meta-features. To demonstrate its usefulness for information retrieval applications, we compare the performance of our method with four other methods on four UCI data sets, three text data sets, and four image data sets. Our approach shows good ranking accuracies, especially when few training data are available. We also use our model to extract and interpret the latent structure of the data sets. In addition, our approach is simple to implement and computationally efficient and can be used for data embedding and visualization.
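As a rough illustration of the core idea named in the abstract (ranking data with a Mahalanobis distance under non-negativity constraints), the following minimal sketch ranks items by their distance to a query. It assumes the common parameterization M = LᵀL with a non-negative matrix L whose rows act as meta-features. The matrix values, the function names `mahalanobis_sq` and `rank_by_distance`, and the toy data are hypothetical. This is not the authors' implementation, and it does not include the learning step or the entropy-based cost.

```python
import numpy as np

def mahalanobis_sq(x, y, L):
    """Squared Mahalanobis distance (x - y)^T M (x - y), assuming M = L.T @ L."""
    diff = L @ (x - y)
    return float(diff @ diff)

def rank_by_distance(query, items, L):
    """Return item indices sorted from nearest to farthest from the query."""
    dists = np.array([mahalanobis_sq(query, item, L) for item in items])
    return np.argsort(dists, kind="stable")

# Hypothetical non-negative projection: 2 meta-features over 3 input features.
# Each row groups input features, which is one way to read "feature clustering".
L = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

query = np.array([0.0, 0.0, 0.0])
items = np.array([[1.0, 1.0, 0.0],   # squared distance 4
                  [0.0, 0.0, 0.5],   # squared distance 1
                  [3.0, 0.0, 0.0]])  # squared distance 9

order = rank_by_distance(query, items, L)
print(order)
```

In a learned model, L would be fitted from pairwise order constraints rather than fixed by hand. Here it is fixed only to show how the distance induces a ranking.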

Keywords: Ranking, Distance metric learning, Feature clustering, Dimensionality reduction, Information retrieval

Paper URL: https://doi.org/10.1007/s10115-012-0574-x