Improved response modeling based on clustering, under-sampling, and ensemble

Authors:

Highlights:

Abstract

The purpose of response modeling for direct marketing is to identify those customers who are likely to purchase a campaigned product, based upon customers’ behavioral history and other information available. Contrary to mass marketing strategy, well-developed response models used for targeting specific customers can contribute profits to firms by not only increasing revenues, but also lowering marketing costs. Endemic in customer data used for response modeling is a class imbalance problem: the proportion of respondents is small relative to non-respondents. In this paper, we propose a novel data balancing method based on clustering, under-sampling, and ensemble to deal with the class imbalance problem, and thus improve response models. Using publicly available response modeling data sets, we compared the proposed method with other data balancing methods in terms of prediction accuracy and profitability. To investigate the usability of the proposed algorithm, we also employed various prediction algorithms when building the response models. Based on the response rate and profit analysis, we found that our proposed method (1) improved the response model by increasing response rate as well as reducing performance variation, and (2) increased total profit by significantly boosting revenue.
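The abstract names three ingredients (clustering, under-sampling, ensemble) but does not spell out the procedure, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that non-respondents are clustered with k-means, that each ensemble member is trained on all respondents plus a cluster-proportional random sample of non-respondents, and that member scores are averaged. All function names, the parameters `k` and `n_models`, the logistic-regression base learner, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fit_logreg(X, y, lr=0.1, n_iter=300):
    """Logistic regression by batch gradient descent (stand-in base learner)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))

def fit_ensemble(X, y, k=3, n_models=5, seed=0):
    """Assumed scheme: cluster non-respondents, under-sample per cluster, train one model per sample."""
    rng = np.random.default_rng(seed)
    pos, neg = X[y == 1], X[y == 0]
    labels = kmeans(neg, k, seed=seed)
    models = []
    for _ in range(n_models):
        parts = []
        for j in range(k):
            idx = np.flatnonzero(labels == j)
            if len(idx) == 0:
                continue
            # Quota proportional to cluster size, so the sample totals about len(pos).
            quota = max(1, round(len(pos) * len(idx) / len(neg)))
            quota = min(quota, len(idx))
            parts.append(neg[rng.choice(idx, size=quota, replace=False)])
        neg_sub = np.vstack(parts)
        X_sub = np.vstack([pos, neg_sub])
        y_sub = np.concatenate([np.ones(len(pos)), np.zeros(len(neg_sub))])
        models.append(fit_logreg(X_sub, y_sub))
    return models

def ensemble_score(models, X):
    """Average the predicted response probabilities over all ensemble members."""
    return np.mean([predict_proba(w, X) for w in models], axis=0)

# Synthetic, illustrative data: 5% respondents, as in a typical imbalanced campaign.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(950, 2)),
               rng.normal(2.0, 1.0, size=(50, 2))])
y = np.concatenate([np.zeros(950), np.ones(50)])
models = fit_ensemble(X, y)
scores = ensemble_score(models, X)
```

Sampling within clusters rather than uniformly is meant to keep each balanced subset representative of the different kinds of non-respondents, and averaging several such models is one way to reduce the run-to-run variation that a single random under-sample would introduce.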

Keywords: Direct marketing, Response modeling, Class imbalance, Data balancing, CRM, Clustering, Ensemble

Review history: Available online 29 December 2011.

Official paper URL: https://doi.org/10.1016/j.eswa.2011.12.028