Random Forests for multiclass classification: Random MultiNomial Logit
作者:
Highlights:
•
摘要
Several supervised learning algorithms are suited to classify instances into a multiclass value space. MultiNomial Logit (MNL) is recognized as a robust classifier and is commonly applied within the CRM (Customer Relationship Management) domain. Unfortunately, to date, it is unable to handle huge feature spaces typical of CRM applications. Hence, the analyst is forced to immerse himself into feature selection. Surprisingly, in sharp contrast with binary logit, current software packages lack any feature-selection algorithm for MultiNomial Logit. Conversely, Random Forests, another algorithm learning multiclass problems, is just like MNL robust but unlike MNL it easily handles high-dimensional feature spaces. This paper investigates the potential of applying the Random Forests principles to the MNL framework. We propose the Random MultiNomial Logit (RMNL), i.e. a random forest of MNLs, and compare its predictive performance to that of (a) MNL with expert feature selection, (b) Random Forests of classification trees. We illustrate the Random MultiNomial Logit on a cross-sell CRM problem within the home-appliances industry. The results indicate a substantial increase in model accuracy of the RMNL model to that of the MNL model with expert feature selection.
论文关键词:Multiclass classifier design and evaluation,Feature evaluation and selection,Data mining methods and algorithms,Customer relationship management (CRM)
论文评审过程:Available online 6 February 2007.
论文官网地址:https://doi.org/10.1016/j.eswa.2007.01.029