A Bayesian stochastic search method for discovering Markov boundaries

Authors:

Highlights:

Abstract

Discovering the Markov boundary (MB) of a target variable from observational data plays a central role in feature selection and local causal structure inference. Most existing methods for this task rely on statistical independence tests and consequently do not take into account the partial evidence that a finite data set provides about the existence of such probabilistic relationships among random variables. In this work, we employ a novel stochastic search method that deals with this problem explicitly by eliciting multiple alternative Markov boundaries. The technique is based on a Bayesian approach to statistical tests and on a method for scoring the alternative solutions. We have also evaluated an interactive procedure for integrating domain or expert knowledge a posteriori (after the learning process) in order to simplify and enrich the set of inferred alternative MBs. In an extensive experimental evaluation, we show that this method finds a rich and accurate set of alternative MBs which, if properly combined, provide better inferences than other state-of-the-art approaches for this task. Moreover, we think that methods of this new kind, which capture the inherent uncertainty of any real data set and allow human intervention, can make practitioners feel more confident about the extracted knowledge than fully automatic approaches do.
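
To make the idea of "partial evidence" concrete, the following is a minimal sketch of one common way to turn an independence test into a posterior probability rather than a yes/no decision. It is not taken from the paper and should not be read as the authors' test: the function names, the symmetric Dirichlet prior (`alpha`), and the prior probability of dependence are all assumptions chosen for illustration. It compares the marginal likelihood of a contingency table under an independence model with that under a full joint model.

```python
import math


def log_marginal_likelihood(counts, alpha=1.0):
    """Log Dirichlet-multinomial marginal likelihood of an observed sequence
    with the given category counts, under a symmetric Dirichlet(alpha) prior.
    (Hypothetical helper; the prior is an assumption for illustration.)"""
    k = len(counts)
    n = sum(counts)
    result = math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
    for c in counts:
        result += math.lgamma(alpha + c) - math.lgamma(alpha)
    return result


def posterior_dependence(table, alpha=1.0, prior_dependence=0.5):
    """Posterior probability that two discrete variables are dependent,
    given their contingency table (a list of rows of counts)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    cells = [c for r in table for c in r]

    # Independence model: separate Dirichlet priors on each marginal.
    log_ind = (log_marginal_likelihood(rows, alpha)
               + log_marginal_likelihood(cols, alpha))
    # Dependence model: one Dirichlet prior over the full joint table.
    log_dep = log_marginal_likelihood(cells, alpha)

    # Posterior log-odds of dependence versus independence.
    log_odds = ((log_dep + math.log(prior_dependence))
                - (log_ind + math.log(1.0 - prior_dependence)))

    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    print(posterior_dependence([[50, 0], [0, 50]]))    # strongly associated table
    print(posterior_dependence([[25, 25], [25, 25]]))  # perfectly balanced table
```

Under these assumptions, a strongly associated table yields a posterior close to 1, while a perfectly balanced table yields a posterior below 0.5 but not 0. A graded value of this kind is what a stochastic search could use to weigh alternative Markov boundaries instead of committing to a single one.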

Keywords: Probabilistic graphical models, Markov boundaries, Feature selection, Bayesian methods, Stochastic search

Article history: Received 14 November 2011, Revised 23 April 2012, Accepted 24 April 2012, Available online 14 May 2012.

Official paper URL: https://doi.org/10.1016/j.knosys.2012.04.028