Two-stage multinomial logit model

Abstract

We propose a two-stage multinomial logit model (TMLM) that incorporates and interprets both interaction and main effects in models for multi-category responses. TMLM combines the robustness of the multinomial logit model (MLM) with the ability of a decision tree (DT) to cluster homogeneous subjects, which makes it possible to incorporate interaction effects among explanatory variables into MLM. In the first stage of TMLM, DT is applied to identify the most influential interaction effects and to create a cluster variable whose categories correspond to the best splits of the optimal tree. In the second stage, the cluster variable enters MLM as an explanatory variable. With TMLM, it is possible to interpret not only the interactions among explanatory variables but also their main effects. It is also possible to cluster and characterize homogeneous subjects, which MLM alone does not allow. The model also improves the classification accuracy rate for multi-category responses. We apply TMLM to national pension data on disability pensioners in Korea and compare the results with two types of MLM. We suggest TMLM as a statistical model for characterizing both the interaction and main effects of explanatory variables and for improving accuracy rates compared with MLM.
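
The two stages described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the abstract does not say which tree algorithm or fitting procedure is used, so a greedy Gini-based tree and plain gradient ascent stand in for them, the function names (`grow_tree`, `cluster_of`, `design`, `fit_mlm`) are hypothetical, and the toy data are invented so that two of the three classes depend on an interaction between `x1` and `x2`.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow_tree(X, y, depth):
    """Stage 1 (assumed CART-style): greedy tree; None marks a leaf."""
    best = None
    if depth > 0 and len(set(y)) > 1:
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X})[:-1]:
                left = [i for i, row in enumerate(X) if row[j] <= t]
                right = [i for i, row in enumerate(X) if row[j] > t]
                score = (len(left) * gini([y[i] for i in left])
                         + len(right) * gini([y[i] for i in right])) / len(y)
                if best is None or score < best[0]:
                    best = (score, j, t, left, right)
    if best is None:
        return None
    _, j, t, left, right = best
    return (j, t,
            grow_tree([X[i] for i in left], [y[i] for i in left], depth - 1),
            grow_tree([X[i] for i in right], [y[i] for i in right], depth - 1))

def cluster_of(tree, row, path="root"):
    """Cluster variable: the label of the leaf a subject falls into."""
    if tree is None:
        return path
    j, t, left, right = tree
    if row[j] <= t:
        return cluster_of(left, row, path + "L")
    return cluster_of(right, row, path + "R")

def design(X, clusters=None, levels=()):
    """Intercept + main effects (+ cluster dummies, first level as reference)."""
    rows = []
    for i, row in enumerate(X):
        z = [1.0] + [float(v) for v in row]
        if clusters is not None:
            z += [1.0 if clusters[i] == lev else 0.0 for lev in levels[1:]]
        rows.append(z)
    return rows

def predict_proba(W, z):
    """Softmax class probabilities for one design row."""
    scores = [sum(w * v for w, v in zip(wk, z)) for wk in W]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    return [e / sum(exps) for e in exps]

def fit_mlm(Z, y, n_classes, lr=0.5, epochs=2000):
    """Stage 2: multinomial logit fitted by batch gradient ascent.
    One weight vector per class is kept (no reference class), which is
    over-parameterized but gives the same predicted probabilities."""
    W = [[0.0] * len(Z[0]) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * len(Z[0]) for _ in range(n_classes)]
        for z, yi in zip(Z, y):
            p = predict_proba(W, z)
            for k in range(n_classes):
                err = (1.0 if k == yi else 0.0) - p[k]
                for d, v in enumerate(z):
                    grad[k][d] += err * v
        for k in range(n_classes):
            for d in range(len(Z[0])):
                W[k][d] += lr * grad[k][d] / len(Z)
    return W

def accuracy(W, Z, y):
    """Share of subjects whose most probable class equals the observed one."""
    hits = 0
    for z, yi in zip(Z, y):
        p = predict_proba(W, z)
        hits += p.index(max(p)) == yi
    return hits / len(y)

# Invented toy data: (x1, x2) -> class, three subjects per cell.
cells = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0, (2, 0): 2, (2, 1): 2}
X = [list(c) for c in cells for _ in range(3)]
y = [k for k in cells.values() for _ in range(3)]

tree = grow_tree(X, y, depth=3)                  # stage 1
clusters = [cluster_of(tree, row) for row in X]
levels = sorted(set(clusters))

Z_main = design(X)                               # MLM with main effects only
acc_main = accuracy(fit_mlm(Z_main, y, 3), Z_main, y)
Z_two = design(X, clusters, levels)              # stage 2: add cluster variable
W_two = fit_mlm(Z_two, y, 3)
acc_two = accuracy(W_two, Z_two, y)
print("clusters:", levels)
print("main-effects accuracy:", acc_main, "two-stage accuracy:", acc_two)
```

In this toy setting the main-effects model cannot separate classes 0 and 1, because they differ only through the `x1`-by-`x2` interaction, while the cluster dummies from the tree carry that interaction into the second-stage model. Accuracy here is measured on the training data only, so the sketch says nothing about out-of-sample performance.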

Keywords: Interaction effect, Multinomial logit model, Decision tree, Accuracy rate, Multi-classification

Review history: Available online 13 November 2010.

Paper URL: https://doi.org/10.1016/j.eswa.2010.11.057