Personalized Knowledge Distillation for Recommender System

Authors:

Highlights:

Abstract

Knowledge Distillation (KD) has been widely studied for recommender systems. KD is a model-independent strategy that produces a small but powerful student model by transferring knowledge from a pre-trained large teacher model. Recent work has shown that knowledge from the teacher's representation space significantly improves the student model. The state-of-the-art method, named Distillation Experts (DE), adopts cluster-wise distillation that transfers the knowledge of each representation cluster separately, so as to distill the diverse preference knowledge in a balanced manner. However, it is challenging to apply DE to a new environment because its performance depends heavily on several key assumptions and hyperparameters that must be tuned for each dataset and each base model. In this work, we propose a novel method, dubbed Personalized Hint Regression (PHR), that distills the preference knowledge in a balanced way without relying on any assumptions about the representation space or any method-specific hyperparameters. To circumvent the clustering, PHR employs a personalization network that enables personalized distillation into the student space for each user/item representation, which can be viewed as a generalization of DE. Extensive experiments on real-world datasets show that PHR achieves performance comparable to, or even better than, DE tuned by a grid search over all of its hyperparameters.
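To make the idea of personalized hint regression more concrete, the sketch below illustrates one plausible reading of the abstract (it is not the authors' implementation): a personalization network generates a per-user/item projection, the student representation is mapped through that projection, and a regression loss matches it to the teacher representation. All class names, the mapping direction, and the network architecture are assumptions for illustration only.

```python
import torch
import torch.nn as nn


class PersonalizedHintRegression(nn.Module):
    """Illustrative sketch of personalized hint regression (hypothetical, not the paper's code)."""

    def __init__(self, student_dim: int, teacher_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.student_dim = student_dim
        self.teacher_dim = teacher_dim
        # Assumed personalization network: produces the weights of a
        # per-user/item linear map from the student space to the teacher space.
        self.personalization = nn.Sequential(
            nn.Linear(teacher_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, student_dim * teacher_dim),
        )

    def forward(self, student_repr: torch.Tensor, teacher_repr: torch.Tensor) -> torch.Tensor:
        batch = student_repr.size(0)
        # Personalized projection, one per representation in the batch.
        proj = self.personalization(teacher_repr).view(batch, self.student_dim, self.teacher_dim)
        # Map each student representation into the teacher space.
        mapped = torch.bmm(student_repr.unsqueeze(1), proj).squeeze(1)
        # Hint-regression loss: squared error against the teacher representation.
        return ((mapped - teacher_repr) ** 2).sum(dim=1).mean()


# Usage sketch: this distillation loss would be added to the student's base
# recommendation loss during training (dimensions here are arbitrary examples).
phr = PersonalizedHintRegression(student_dim=20, teacher_dim=200)
loss_kd = phr(torch.randn(32, 20), torch.randn(32, 200))
```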

Keywords: Recommender System, Knowledge Distillation, Model compression, Retrieval efficiency

Article history: Received 20 May 2021, Revised 6 December 2021, Accepted 11 December 2021, Available online 17 December 2021, Version of Record 11 January 2022.

Paper URL: https://doi.org/10.1016/j.knosys.2021.107958