On the equivalence between nonnegative tensor factorization and tensorial probabilistic latent semantic analysis

Authors: Wei Peng, Tao Li

Abstract

Non-negative Matrix Factorization (NMF) and Probabilistic Latent Semantic Analysis (PLSA) are two widely used methods for non-negative decomposition of two-way data (e.g., document-term matrices). Studies have shown that PLSA and NMF (with the Kullback-Leibler divergence objective) are different algorithms optimizing the same objective function. Recently, the analysis of multi-way data (i.e., tensors) has attracted a lot of attention, as multi-way data have rich intrinsic structures and arise naturally in many real-world applications. In this paper, the relationships between the extensions of NMF and PLSA to multi-way data, namely NTF (Non-negative Tensor Factorization) and T-PLSA (Tensorial Probabilistic Latent Semantic Analysis), are studied. Two types of T-PLSA models are shown to be equivalent to two well-known non-negative factorization models: PARAFAC and Tucker3 (with the KL-divergence objective). NTF and T-PLSA are also compared empirically in terms of objective functions, decomposition results, clustering quality, and computational complexity on both synthetic and real-world datasets. Finally, we show that a hybrid method that runs NTF and T-PLSA alternately can escape each other’s local minima and thus achieve better clustering performance.
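To make the PARAFAC model with the KL-divergence objective concrete, the following is a minimal NumPy sketch of three-way NTF using standard Lee-Seung style multiplicative updates. It is an illustration under stated assumptions, not the authors' implementation: the function names (`kl_divergence`, `reconstruct`, `ntf_kl_parafac`), the random initialization, the iteration count, and the small constant `eps` are all hypothetical choices made for this example.

```python
import numpy as np


def kl_divergence(X, Xhat, eps=1e-12):
    """Generalized KL divergence D(X || Xhat), summed over all entries."""
    return float(np.sum(X * np.log((X + eps) / (Xhat + eps)) - X + Xhat))


def reconstruct(A, B, C):
    """PARAFAC reconstruction: Xhat[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def ntf_kl_parafac(X, rank, n_iter=50, seed=0, eps=1e-12):
    """Illustrative KL-divergence NTF (PARAFAC) via multiplicative updates.

    Hypothetical helper for this example; initialization and stopping
    rule are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Strictly positive random initialization keeps every update well defined.
    A = rng.random((I, rank)) + 0.1
    B = rng.random((J, rank)) + 0.1
    C = rng.random((K, rank)) + 0.1
    history = []
    for _ in range(n_iter):
        # Update A with B and C fixed.
        ratio = X / (reconstruct(A, B, C) + eps)
        A *= np.einsum("ijk,jr,kr->ir", ratio, B, C) / (B.sum(0) * C.sum(0) + eps)
        # Update B with A and C fixed.
        ratio = X / (reconstruct(A, B, C) + eps)
        B *= np.einsum("ijk,ir,kr->jr", ratio, A, C) / (A.sum(0) * C.sum(0) + eps)
        # Update C with A and B fixed.
        ratio = X / (reconstruct(A, B, C) + eps)
        C *= np.einsum("ijk,ir,jr->kr", ratio, A, B) / (A.sum(0) * B.sum(0) + eps)
        history.append(kl_divergence(X, reconstruct(A, B, C)))
    return A, B, C, history


if __name__ == "__main__":
    X = np.random.default_rng(1).random((4, 5, 6)) + 0.1
    A, B, C, history = ntf_kl_parafac(X, rank=3, n_iter=30)
    print("first and last KL objective:", history[0], history[-1])
    print("sum of X vs. sum of reconstruction:", X.sum(), reconstruct(A, B, C).sum())
```

After each sweep the reconstruction has, up to the effect of `eps`, the same total mass as `X`. This is the property that lets the factors be rescaled into probability distributions, which is one way to see the connection to a PLSA-style latent-class model that the abstract describes.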

Keywords: Nonnegative tensor factorization, Probabilistic latent semantic analysis, Clustering

Paper URL: https://doi.org/10.1007/s10489-010-0220-9