Online random forests regression with memories

Abstract

In recent years, online variants of conventional Random Forests (RFs) have attracted much attention for their ability to handle sequential data, or data whose distribution changes during prediction. However, most research on online RFs focuses on structural modification during the training stage, overlooking critical properties of sequential datasets such as autocorrelation. In this paper, we demonstrate how to improve the predictive accuracy of the regression model by exploiting data correlation. Instead of modifying the structure of offline-trained RFs, we endow RFs with memory during regression prediction through an online weight learning approach, called Online Weight Learning Random Forest Regression (OWL-RFR). Specifically, the weights of leaves are updated by a novel adaptive stochastic gradient descent method in which, unlike a static learning rate, the adaptive learning rate accounts for both the current and historical prediction bias ratios. A leaf-level weight thus stores information learned from past data points for future correlated predictions: whereas a tree-level weight carries only immediate memory for the current prediction, a leaf-level weight provides long-term memory. Numerical experiments show that OWL-RFR achieves remarkable improvements in predictive accuracy on several common machine learning datasets, compared to traditional RFs and other online approaches. Moreover, our results verify that weighting via the long-term memory of leaf-level weights is more effective than the immediate dependency of tree-level weights. We also show that the proposed adaptive learning rate outperforms a static rate on most datasets, and we demonstrate the convergence and stability of our method.
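To make the mechanism concrete, below is a minimal sketch of leaf-level online weight learning on top of an offline-trained forest, in the spirit of the abstract. The paper's exact formulas are not given here, so the initialization, the squared-error SGD step, and the bias-ratio learning rate with an exponential moving average of historical bias are all illustrative assumptions, not the authors' method.

```python
# Sketch: leaf-level weights with an adaptive SGD update on a data stream.
# Assumed details: weights start at 1, loss is squared error, and the
# adaptive rate scales a base rate by current bias / historical bias.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=5.0, random_state=0)
X_train, y_train = X[:300], y[:300]
X_stream, y_stream = X[300:], y[300:]

# Offline-trained forest; its structure is never modified afterwards.
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_train, y_train)

# One weight per (tree, leaf); initialized to 1 so the first prediction
# coincides with the plain forest average.
weights = [np.ones(t.tree_.node_count) for t in forest.estimators_]
eta0, hist_bias, eps = 0.05, 1.0, 1e-8

for x, target in zip(X_stream, y_stream):
    x = x.reshape(1, -1)
    leaves = forest.apply(x)[0]                      # leaf index hit in each tree
    tree_preds = np.array([t.predict(x)[0] for t in forest.estimators_])
    leaf_w = np.array([weights[i][leaf] for i, leaf in enumerate(leaves)])
    y_hat = np.mean(leaf_w * tree_preds)             # weighted ensemble output

    bias = abs(target - y_hat)
    eta = eta0 * bias / (hist_bias + eps)            # assumed adaptive rate from
    hist_bias = 0.9 * hist_bias + 0.1 * bias         # current vs. historical bias

    # SGD step on squared error w.r.t. each visited leaf's weight; only the
    # leaves hit by this sample are updated, so they retain (remember) it.
    grad = -2.0 * (target - y_hat) * tree_preds / len(tree_preds)
    for i, leaf in enumerate(leaves):
        weights[i][leaf] -= eta * grad[i]
```

Because each sample touches one leaf per tree, the updates persist in exactly those leaves and are reused the next time a correlated sample routes to them; this is what distinguishes the long-term leaf-level memory from a single tree-level weight that is overwritten at every step.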

Keywords: Random forests regression, Long-term memory, Online weight learning, Leaf-level, Adaptive learning rate, Stochastic gradient descent

Article history: Received 3 December 2019, Revised 18 May 2020, Accepted 19 May 2020, Available online 26 May 2020, Version of Record 27 May 2020.

DOI: https://doi.org/10.1016/j.knosys.2020.106058