Comparing PSO-based clustering over contextual vector embeddings to modern topic modeling

Authors:

Highlights:

• Comparing evolutionary (pPSO) to generative topic modeling (ETM and NVDM).

• Pre-trained language embeddings are efficient at encoding text for evolutionary topic models.

• pPSO generates interpretable topics for health forum and newsgroup posts.

• The methodology does not require a corpus-specific embedding or vocabulary.

Keywords: Topic modeling, Clustering, Vector embedding, PSO, ETM, NVDM

Review history: Received 3 November 2021, Revised 25 January 2022, Accepted 24 February 2022, Available online 14 March 2022, Version of Record 14 March 2022.

Official paper URL: https://doi.org/10.1016/j.ipm.2022.102921