Dynamic clustering based contextual combinatorial multi-armed bandit for online recommendation

作者：

Highlights：

•

摘要

Recommender systems still face a trade-off between exploring new items to maximize user satisfaction and exploiting those already interacted with to match user interests. This problem is widely recognized as the exploration/exploitation (EE) dilemma, and the multi-armed bandit (MAB) algorithm has proven to be an effective solution. As the scale of users and items in real-world application scenarios increases, their purchase interactions become sparser. Then three issues need to be investigated when building MAB-based recommender systems. First, large-scale users and sparse interactions increase the difficulty of user preference mining. Second, traditional bandits model items as arms and cannot deal with ever-growing items effectively. Third, widely used Bernoulli-based reward mechanisms only feedback 0 or 1, ignoring rich implicit feedback such as behaviors like click and add-to-cart. To address these problems, we propose an algorithm named Dynamic Clustering based Contextual Combinatorial Multi-Armed Bandits (DC3MAB), which consists of three configurable key components. Specifically, a dynamic user clustering strategy enables different users in the same cluster to cooperate in estimating the expected rewards of arms. A dynamic item partitioning approach based on collaborative filtering significantly reduces the scale of arms and produces a recommendation list instead of one item to provide diversity. In addition, a multi-class reward mechanism based on fine-grained implicit feedback helps better capture user preferences. Extensive empirical experiments on three real-world datasets demonstrate the superiority of our proposed DC3MAB over state-of-the-art bandits (On average, +75.8% in F1 and +54.3% in cumulative reward). The source code is available at https://github.com/HaixHan/DC3MAB.

论文关键词：Online recommendation,Dynamic clustering,Contextual multi-armed bandit,Implicit feedback

论文评审过程：Received 3 May 2022, Revised 26 August 2022, Accepted 18 September 2022, Available online 23 September 2022, Version of Record 6 October 2022.

论文官网地址：https://doi.org/10.1016/j.knosys.2022.109927