Tree ensembles for predicting structured outputs

作者:

Highlights:

摘要

In this paper, we address the task of learning models for predicting structured outputs. We consider both global and local predictions of structured outputs, the former based on a single model that predicts the entire output structure and the latter based on a collection of models, each predicting a component of the output structure. We use ensemble methods and apply them in the context of predicting structured outputs. We propose to build ensemble models consisting of predictive clustering trees, which generalize classification trees: these have been used for predicting different types of structured outputs, both locally and globally. More specifically, we develop methods for learning two types of ensembles (bagging and random forests) of predictive clustering trees for global and local predictions of different types of structured outputs. The types of outputs considered correspond to different predictive modeling tasks: multi-target regression, multi-target classification, and hierarchical multi-label classification. Each of the combinations can be applied both in the context of global prediction (producing a single ensemble) or local prediction (producing a collection of ensembles). We conduct an extensive experimental evaluation across a range of benchmark datasets for each of the three types of structured outputs. We compare ensembles for global and local prediction, as well as single trees for global prediction and tree collections for local prediction, both in terms of predictive performance and in terms of efficiency (running times and model complexity). The results show that both global and local tree ensembles perform better than the single model counterparts in terms of predictive power. Global and local tree ensembles perform equally well, with global ensembles being more efficient and producing smaller models, as well as needing fewer trees in the ensemble to achieve the maximal performance.

论文关键词:Ensemble methods,Predictive clustering trees,Structured outputs,Multi-target regression,Multi-target classification,Hierarchical multi-label classification

论文评审过程:Received 14 May 2012, Revised 26 September 2012, Accepted 30 September 2012, Available online 10 October 2012.

论文官网地址:https://doi.org/10.1016/j.patcog.2012.09.023