Dependence-biased clustering for variable selection with random forests
Authors:
Highlights:
• We introduce a novel conditional permutation measure for variable importance.
• This measure leverages inter-variable dependencies via biased K-means clustering.
• This measure makes it possible to select a small number of relevant, non-redundant variables.
• Extensive results show our variable selection approach is very effective in practice.
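For readers unfamiliar with permutation importance, the following is a minimal sketch of the plain (unconditional) measure that the highlights build on. It is not the paper's conditional, cluster-based variant: the toy data, the stand-in `model` function (used in place of a fitted random forest), and the helper names `mse` and `permutation_importance` are all assumptions made for illustration.

```python
import random


def mse(model, X, y):
    """Mean squared error of `model` (a callable on one row) over the dataset."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, column, n_repeats=20, seed=0):
    """Average increase in MSE when the values of one column are shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    increases = []
    for _ in range(n_repeats):
        # Shuffle the chosen column across rows, leaving other columns intact.
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [
            row[:column] + [value] + row[column + 1:]
            for row, value in zip(X, shuffled)
        ]
        increases.append(mse(model, X_perm, y) - baseline)
    return sum(increases) / n_repeats


# Toy data (assumed): the response depends on x0 only; x1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]


def model(row):
    # Hypothetical stand-in for a fitted predictor; it ignores x1 entirely.
    return 3.0 * row[0]


imp_x0 = permutation_importance(model, X, y, column=0)
imp_x1 = permutation_importance(model, X, y, column=1)
print(f"importance of x0: {imp_x0:.3f}")
print(f"importance of x1: {imp_x1:.3f}")
```

In this sketch the relevant variable receives a positive score and the ignored one scores zero. Shuffling a column unconditionally breaks its dependence on the other variables, which is the behaviour a conditional measure such as the one described in the highlights aims to address.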
Keywords: Variable selection, Random forest, Permutation importance, Regression, Classification, Clustering
Review history: Received 13 October 2018, Revised 29 May 2019, Accepted 21 July 2019, Available online 24 July 2019, Version of Record 5 August 2019.
Official paper URL: https://doi.org/10.1016/j.patcog.2019.106980