Modifications of the construction and voting mechanisms of the Random Forests Algorithm

Authors:

Highlights:

Abstract

The aim of this work is to propose modifications of the Random Forests algorithm that improve its prediction performance. The modifications are intended to increase the strength and decrease the correlation of the individual trees in the forest, and to improve the function that determines how the outputs of the base classifiers are combined. This is achieved by modifying the node splitting and the voting procedures. For node splitting, different approaches are examined concerning the number of predictors and the evaluation measure that determines the impurity of a node. For the voting procedure, modifications based on feature selection, clustering, nearest neighbors and optimization techniques are proposed. The novelty of the current work is that it proposes modifications not only to the construction or the voting mechanism individually but also, for the first time, examines the overall improvement of the Random Forests algorithm (a combination of construction and voting). We evaluate the proposed modifications using 24 datasets. The evaluation demonstrates that the proposed modifications have a positive effect on the performance of the Random Forests algorithm and that they provide comparable and, in most cases, better results than existing approaches.
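To make the idea of changing how base-classifier outputs are combined concrete, the following minimal sketch contrasts plain majority voting with a weighted vote. It is an illustration only, not the paper's method. The function names and the per-tree weights are hypothetical; the abstract does not specify how the feature selection, clustering, nearest neighbor or optimization techniques derive their weights.

```python
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Combine one class label per tree into a single label.

    tree_weights is a hypothetical per-tree weight (for example, an
    accuracy estimate); the paper's actual weighting schemes are not
    reproduced here.
    """
    if len(tree_predictions) != len(tree_weights):
        raise ValueError("one weight per tree is required")
    scores = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        scores[label] += weight
    # Iterating over sorted labels makes tie-breaking deterministic:
    # among equally scored labels, the smallest one wins.
    return max(sorted(scores), key=lambda label: scores[label])


def majority_vote(tree_predictions):
    """Standard Random Forests voting: every tree counts equally."""
    return weighted_vote(tree_predictions, [1.0] * len(tree_predictions))


if __name__ == "__main__":
    predictions = ["a", "b", "b"]  # labels predicted by three trees
    print(majority_vote(predictions))
    print(weighted_vote(predictions, [0.9, 0.3, 0.3]))
```

With equal weights the two trees voting for "b" win; when the first tree is given a sufficiently larger weight, its label overrides the majority, which is the kind of behavior a weighted voting scheme is meant to allow.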

Keywords: Classification, Random Forests, Ensemble methods, Weighted voting, Decision tree

Review history: Received 19 May 2011, Revised 12 July 2013, Accepted 12 July 2013, Available online 6 August 2013.

Paper URL: https://doi.org/10.1016/j.datak.2013.07.002