Supervised labeled latent Dirichlet allocation for document categorization

Authors: Ximing Li, Jihong Ouyang, Xiaotang Zhou, You Lu, Yanhui Liu

Abstract

Supervised topic modeling approaches have recently received considerable attention. However, the representative labeled latent Dirichlet allocation (L-LDA) method tends to over-focus on the pre-assigned labels and gives insufficient consideration to potentially lost labels and common semantics. To overcome these problems, we propose an extension of L-LDA, namely supervised labeled latent Dirichlet allocation (SL-LDA), for document categorization. Our model makes two fundamental assumptions, Prior 1 and Prior 2, which relax the restriction on label sampling and extend the concept of topics. We develop a Gibbs expectation-maximization algorithm to learn the SL-LDA model. Quantitative experimental results demonstrate that SL-LDA is competitive with state-of-the-art approaches on both single-label and multi-label corpora.

Keywords: Supervised, Topic modeling, Latent Dirichlet allocation, Multi-label classification

Paper URL: https://doi.org/10.1007/s10489-014-0595-0