Topic change point detection using a mixed Bayesian model
Authors: Xiaoling Lu, Yuxuan Guo, Jiayi Chen, Feifei Wang
Abstract
Dynamic text documents, such as news articles, user reviews, and blogs, are now common in many fields, and the topics underlying these text streams change over time. As text documents accumulate, there is a great need for automatic text analysis models that find the key changes in topics. To this end, this study proposes a topic change point detection (Topic-CD) model. Unlike previous studies, we define a topic change point in terms of the hyperparameters associated with the topic-word distributions, which allows the model to detect change points underlying the whole topic set. Under this definition, topic modeling and change point detection are combined in a unified framework and performed simultaneously using a Markov chain Monte Carlo algorithm. In addition, the Topic-CD model does not require the number of change points to be set in advance, which makes it more convenient for practical use. We investigate the performance of the Topic-CD model numerically using synthetic data and three real datasets. The results show that the Topic-CD model identifies change points in topics well when compared with several state-of-the-art methods.
Keywords: Change point detection, Dynamic topic models, Latent Dirichlet allocation, Markov chain Monte Carlo
Paper URL: https://doi.org/10.1007/s10618-021-00804-1