Efficient discovery of contrast subspaces for object explanation and characterization

Authors: Lei Duan, Guanting Tang, Jian Pei, James Bailey, Guozhu Dong, Vinh Nguyen, Akiko Campbell, Changjie Tang

Abstract

We tackle the novel problem of mining contrast subspaces. Given a set of multidimensional objects in two classes \(C_+\) and \(C_-\) and a query object \(o\), we want to find the top-\(k\) subspaces that maximize the ratio of the likelihood of \(o\) in \(C_+\) to that in \(C_-\). Such subspaces are very useful for characterizing an object and explaining how it differs between two classes. We demonstrate that this problem has important applications and, at the same time, is very challenging, being MAX SNP-hard. We present CSMiner, a mining method that uses kernel density estimation in conjunction with various pruning techniques. We experimentally investigate the performance of CSMiner on a range of data sets, evaluating its efficiency, effectiveness, and stability, and demonstrating that it is substantially faster than a baseline method.
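
To make the problem statement concrete, the following is a minimal sketch of a brute-force search for contrast subspaces. It is not CSMiner and uses none of its pruning techniques; it simply enumerates every non-empty subspace and scores it by the likelihood ratio described in the abstract. The function names, the Gaussian product kernel, the fixed `bandwidth`, and the small `eps` guard against division by zero are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations

import numpy as np


def kde_density(query, data, bandwidth=0.5):
    """Gaussian kernel density estimate of `data` evaluated at `query`.

    `bandwidth` is a fixed illustrative value, not the paper's choice.
    """
    n, d = data.shape
    sq_dist = np.sum((data - query) ** 2, axis=1)
    norm = n * (np.sqrt(2.0 * np.pi) * bandwidth) ** d
    return np.sum(np.exp(-sq_dist / (2.0 * bandwidth ** 2))) / norm


def likelihood_contrast(query, pos, neg, subspace, bandwidth=0.5, eps=1e-12):
    """Ratio of the estimated likelihood of `query` in C+ to that in C-,
    restricted to the dimensions listed in `subspace`.

    `eps` is an assumed guard against a zero denominator.
    """
    dims = list(subspace)
    l_pos = kde_density(query[dims], pos[:, dims], bandwidth)
    l_neg = kde_density(query[dims], neg[:, dims], bandwidth)
    return l_pos / (l_neg + eps)


def top_k_contrast_subspaces(query, pos, neg, k=3, bandwidth=0.5):
    """Exhaustively score all 2^d - 1 subspaces and return the k best
    as (ratio, subspace) pairs, highest ratio first."""
    n_dims = pos.shape[1]
    scored = []
    for size in range(1, n_dims + 1):
        for subspace in combinations(range(n_dims), size):
            ratio = likelihood_contrast(query, pos, neg, subspace, bandwidth)
            scored.append((ratio, subspace))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The exhaustive loop grows exponentially with the number of dimensions, which is the cost that the pruning techniques mentioned in the abstract are meant to reduce.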

Keywords: Contrast subspace, Kernel density estimation, Likelihood contrast

Paper URL: https://doi.org/10.1007/s10115-015-0835-6