FastBTM: Reducing the sampling time for biterm topic model

Authors:

Highlights:

• We combine the alias method, Metropolis-Hastings sampling, and a factorized strategy to propose FastBTM, an acceleration algorithm for BTM.

• FastBTM reduces the sampling complexity of the biterm topic model from O(K) to O(1), where K is the number of topics.

• FastBTM converges faster than BTM without degrading topic quality.

• FastBTM is effective on both short-text datasets and long-document datasets.
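
The O(1) claim in the highlights rests on the alias method, which draws from a K-outcome discrete distribution in constant time once a table has been built. The following is a minimal sketch of the generic technique (Vose's variant), not the FastBTM implementation itself; the names `build_alias_table` and `draw` are chosen for illustration, and the example assumes the input probabilities already sum to 1.

```python
import random


def build_alias_table(probs):
    """Build an alias table in O(K) time (Vose's variant).

    Illustrative helper, not taken from the paper.
    Assumes `probs` is a list of non-negative floats summing to 1.
    """
    k = len(probs)
    scaled = [p * k for p in probs]            # rescale so the mean is 1
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    accept = [1.0] * k                         # probability of keeping slot i
    alias = list(range(k))                     # fallback outcome for slot i
    while small and large:
        s = small.pop()
        l = large.pop()
        accept[s] = scaled[s]
        alias[s] = l
        scaled[l] -= 1.0 - scaled[s]           # l donates mass to fill slot s
        (small if scaled[l] < 1.0 else large).append(l)
    return accept, alias


def draw(accept, alias, rng=random):
    """Draw one outcome in O(1): pick a slot, then keep it or take its alias."""
    i = rng.randrange(len(accept))
    return i if rng.random() < accept[i] else alias[i]


# Hypothetical topic distribution over K = 4 topics.
accept, alias = build_alias_table([0.1, 0.2, 0.3, 0.4])
topic = draw(accept, alias)
```

Building the table costs O(K), so the method only pays off when one table serves many draws. In a Gibbs sampler the target distribution changes after every update, so a table goes stale. This is presumably where the Metropolis-Hastings step named in the highlights comes in: draws from a stale table act as proposals that are then accepted or rejected against the current distribution.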

Keywords: BTM, Topic model, Alias method, Metropolis-Hastings, Acceleration algorithm

Review history: Received 23 December 2016, Revised 31 May 2017, Accepted 3 June 2017, Available online 6 June 2017, Version of Record 24 July 2017.

Paper URL: https://doi.org/10.1016/j.knosys.2017.06.005