LINDA-BN: An interpretable probabilistic approach for demystifying black-box predictive models
Authors:
Highlights:
• A model-agnostic interpretable probabilistic model based on Bayesian Networks.
• Explanations based on graphical models depicting relationships between features and the class.
• The formalisation of four explainable rules that can provide insights to the decision-maker on whether to trust a prediction.
Abstract
The use of sophisticated machine learning models for critical decision-making faces the challenge that these models are often applied as a ‘black-box’. This has led to an increased interest in interpretable machine learning, where post-hoc model-agnostic algorithms present a useful mechanism for generating interpretations of complex learning models. This paper proposes a novel approach based on Bayesian Networks to generate local post-hoc model-agnostic interpretations of a black-box predictive model. The proposed approach presents the features that are conditionally dependent on each other and that directly influence the class variable. This enables the decision-maker to better understand how features are related and why a certain prediction was made. Compared to existing post-hoc interpretation methods, the contribution of our approach is three-fold: (1) as a probabilistic graphical model, the extracted Bayesian network can provide interpretations through conditional dependencies in a graphical structure, showing which input features contributed to a prediction and how and why they did so; (2) for complex decision problems with many features, a Markov blanket can be generated from the extracted Bayesian network to provide interpretations with a focused view on those input features that directly contributed to a prediction; (3) the extracted Bayesian network enables the identification of four different rules which can inform the decision-maker about the confidence level in a prediction, thus helping the decision-maker assess the reliability of predictions made by a black-box model. We implemented the proposed approach, applied it to two well-known public datasets and analysed the results, which are made available in an open-source repository: https://github.com/catarina-moreira/LINDA_DSS.
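The Markov blanket mentioned in contribution (2) can be illustrated with a minimal sketch. It assumes the extracted Bayesian network is available as a plain mapping from each node to its list of parents; the function `markov_blanket`, the variable `toy_parents` and the feature names are hypothetical and are not taken from the LINDA-BN repository. For a directed acyclic graph, the Markov blanket of a node is the set of its parents, its children, and the other parents of its children.

```python
from typing import Dict, List, Set


def markov_blanket(parents: Dict[str, List[str]], target: str) -> Set[str]:
    """Return the parents, children, and children's other parents of `target`.

    `parents` maps every node of a DAG to the list of its parent nodes.
    """
    # Start with the direct parents of the target.
    blanket: Set[str] = set(parents.get(target, []))
    for node, node_parents in parents.items():
        if target in node_parents:
            # `node` is a child of the target ...
            blanket.add(node)
            # ... and its remaining parents are the target's co-parents.
            blanket.update(node_parents)
    # The target itself is not part of its own blanket.
    blanket.discard(target)
    return blanket


# Hypothetical toy network (illustrative names only, not results from the paper).
toy_parents = {
    "Glucose": [],
    "BMI": [],
    "Age": [],
    "Pregnancies": ["Age"],
    "Outcome": ["Glucose", "BMI"],
    "Insulin": ["Outcome", "Age"],
}

print(sorted(markov_blanket(toy_parents, "Outcome")))
# → ['Age', 'BMI', 'Glucose', 'Insulin']
```

In this toy graph, `Pregnancies` is left out of the blanket of `Outcome` because it is connected to the class only through `Age`, which is the kind of focused view the abstract describes.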
Keywords: Interpretable machine learning, Post-hoc interpretation, Probabilistic inference, Bayesian network, Predictive analytics, Explainable AI
Article history: Received 16 July 2020, Revised 3 April 2021, Accepted 4 April 2021, Available online 9 April 2021, Version of Record 24 September 2021.
Official paper URL: https://doi.org/10.1016/j.dss.2021.113561