A comparison of random forest variable selection methods for classification prediction modeling
Authors:
Highlights:
• We compare the performance of random forest variable selection methods.
• VSURF or Jiang's method is preferable for most datasets.
• Either varSelRF or Boruta performs well for data with >50 predictors.
• Methods based on conditional random forests usually have similar performance.
• The type of method, test-based or performance-based, is unlikely to impact performance.
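To make the highlights above concrete, the sketch below illustrates the general idea of random-forest-based variable selection: rank predictors by an importance measure and retain only those with clearly non-zero importance. This is a minimal, generic Python example using scikit-learn permutation importance; it is not one of the methods benchmarked in the paper (VSURF, varSelRF, Boruta, and Jiang's method are R implementations), and the simulated data and selection threshold are illustrative assumptions only.

```python
# Generic random-forest variable selection sketch (illustrative only; not the
# paper's benchmarked methods). Predictors are ranked by permutation importance
# and kept if their mean importance exceeds zero by more than two standard
# deviations -- an assumed, simple filtering rule for demonstration.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Simulated classification data: 100 predictors, only 5 of them informative.
X, y = make_classification(n_samples=500, n_features=100,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit a standard random forest classifier.
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)

# Compute permutation importance on held-out data and select predictors.
imp = permutation_importance(rf, X_test, y_test, n_repeats=20, random_state=0)
selected = [i for i in range(X.shape[1])
            if imp.importances_mean[i] - 2 * imp.importances_std[i] > 0]
print(f"Selected {len(selected)} of {X.shape[1]} predictors:", selected)
```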
Abstract
Keywords: Random forest, Variable selection, Feature reduction, Classification
Article history: Received 11 October 2018, Revised 21 May 2019, Accepted 22 May 2019, Available online 23 May 2019, Version of Record 6 June 2019.
Article URL: https://doi.org/10.1016/j.eswa.2019.05.028