A named entity topic model for news popularity prediction

Abstract

Predicting the popularity of web content is widely regarded as an important but challenging task. Online news articles are typical examples of this. In particular, owing to their time-sensitive nature, it is preferable to predict the popularity of news articles before their publication. To achieve this, this study proposes a named entity topic model (NETM) to extract the textual factors that can drive popularity growth. Here, each named entity is assumed to have a popularity-gain distribution over all semantic topics. The popularity of a news article is considered as the accumulation of popularity gains generated by its named entities (NEs) over all the topics. By learning the popularity-gain matrix for each named entity, the popularity of any news article can be predicted. Experiments on two collections of news articles demonstrate that the proposed NETM can outperform existing models in terms of accuracy. Additionally, the popularity-gain matrix learned by the NETM can be used to effectively explain the popularity of specific news articles.
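
The additive idea in the abstract (an article's popularity as the sum of the gains contributed by its named entities across topics) can be illustrated with a minimal sketch. The abstract does not specify the model's exact form, so everything here is an assumption for illustration only: the function name `predict_popularity`, the representation of the gain matrix as a dictionary of per-topic lists, the weighting of gains by the article's topic proportions, the rule of skipping unseen entities, and the toy numbers. It is not the authors' NETM inference procedure.

```python
from typing import Dict, List


def predict_popularity(
    entities: List[str],
    topic_weights: List[float],
    gain: Dict[str, List[float]],
) -> float:
    """Sum topic-weighted popularity gains over an article's named entities.

    entities      -- named entities found in the article (hypothetical input)
    topic_weights -- assumed topic proportions of the article, one per topic
    gain          -- assumed learned gains: entity -> one gain value per topic
    """
    total = 0.0
    for entity in entities:
        row = gain.get(entity)
        if row is None:
            # Assumption: entities unseen during training contribute nothing.
            continue
        # Accumulate this entity's gain over all topics.
        total += sum(w * g for w, g in zip(topic_weights, row))
    return total


# Toy values, invented purely for illustration (two topics).
toy_gain = {
    "NASA": [4.0, 1.0],
    "Mars": [2.0, 0.0],
}
toy_score = predict_popularity(["NASA", "Mars", "Unknown"], [0.5, 0.5], toy_gain)
print(toy_score)
```

In this toy setting, inspecting the rows of `toy_gain` shows which entity and topic contribute most to the score, which loosely mirrors how the abstract describes using the learned gains to explain an article's popularity.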

Keywords: Named entity, Web content, Popularity prediction, Topic model

Review history: Received 14 May 2020, Revised 1 August 2020, Accepted 10 September 2020, Available online 17 September 2020, Version of Record 18 September 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.106430