Interpreting the black box of supervised learning models: Visualizing the impacts of features on prediction

Authors: Xiaohang Zhang, Yuan Wang, Zhengren Li

Abstract

Machine learning models are widely used across many domains. However, the internal mechanisms of popular models, such as neural networks and support vector machines, are difficult for humans to understand, so these models are often called “black boxes”. This study proposes a general method for gaining insight into supervised learning models by visualizing the impacts of input features on their predictions. Existing methods either analyze feature impacts for each individual observation, which can obscure the overall behavior of the model, or provide a single impact pattern for all observations, which ignores differences in impact across observations. The proposed method instead distinguishes a set of typical impact patterns, each corresponding to a different group of observations. It then uses tree rules to map the detected patterns into the feature space, which helps locate where each pattern applies. More importantly, the feature relationships embedded in the prediction model can be revealed through the resulting tree rule-based feature relationship network. We apply the method to various simulated and real data sets, and the results demonstrate how it helps in understanding both how features affect model predictions and how the features relate to one another.
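The general idea in the abstract (per-observation feature impacts, grouped into patterns, then located in feature space by a rule) can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not give their impact measure, grouping procedure, or tree construction. The sketch assumes a finite-difference impact, groups observations by the sign of that impact, and uses a single-split rule (a decision stump) in place of a full tree. The names `black_box`, `impact`, and `best_stump` are hypothetical, and `black_box` is a hand-written stand-in for a trained model.

```python
import random

def black_box(x):
    # Hypothetical stand-in for a trained model: the effect of x[1]
    # on the output flips with the sign of x[0].
    return x[1] if x[0] > 0 else -x[1]

def impact(model, x, j, delta=1e-3):
    # Assumed impact measure: finite-difference change in the prediction
    # when feature j of one observation is nudged by delta.
    shifted = list(x)
    shifted[j] += delta
    return (model(shifted) - model(x)) / delta

def best_stump(X, labels):
    # Exhaustive search for the single-feature threshold rule
    # "x[j] > t" that best separates two pattern labels.
    best = (-1.0, None, None)  # (accuracy, feature index, threshold)
    n = len(X)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            hits = sum((row[j] > t) == lab for row, lab in zip(X, labels))
            acc = max(hits, n - hits) / n  # allow either orientation of the rule
            if acc > best[0]:
                best = (acc, j, t)
    return best

random.seed(0)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]

# Step 1: impact of feature 1 on the prediction, for each observation.
impacts = [impact(black_box, x, 1) for x in X]

# Step 2: two impact patterns, positive impact versus negative impact.
labels = [imp > 0 for imp in impacts]

# Step 3: locate the patterns in feature space with a one-split rule.
acc, feature, threshold = best_stump(X, labels)
print(f"rule splits on feature {feature} at {threshold:.3f}, accuracy {acc:.2f}")
```

In this toy setting the rule that separates the two impact patterns of feature 1 splits on feature 0, which is the kind of feature relationship the abstract says tree rules can expose.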

Keywords: Feature impact, Model interpretation, Visualization, Machine learning, Black box

Paper URL: https://doi.org/10.1007/s10489-021-02255-z