A data sampling and attribute selection strategy for improving decision tree construction

Authors:

Highlights:

• Decision tree construction suffers from over-fitting and excessive tree complexity.

• Combining attribute selection with data sampling aims to overcome these construction problems.

• This paper presents a particle swarm optimization (PSO) approach for selecting an optimal combination of attributes and samples.

• The selected solution helps avoid over-fitting and complexity problems.

• Empirical results show that the proposed approach yields good solutions.
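
The idea in the highlights above can be illustrated with a minimal sketch. It is not the paper's implementation: the encoding (one bit per attribute followed by one bit per training sample), the fitness function (validation accuracy minus a small size penalty), the PSO parameters, and every function name are assumptions made for illustration. A depth-1 tree (decision stump) stands in for a full decision tree learner so the example runs on the Python standard library alone, and the search uses the standard binary PSO update with a sigmoid transfer function.

```python
import math
import random

def train_stump(rows, labels, attrs):
    """Fit a depth-1 tree: the (attribute, threshold) with the fewest training errors."""
    best = None  # (errors, attr, threshold, left_label, right_label)
    for a in attrs:
        for t in sorted({r[a] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[a] <= t]
            right = [y for r, y in zip(rows, labels) if r[a] > t]
            l_lab = max(set(left), key=left.count)  # left is never empty here
            r_lab = max(set(right), key=right.count) if right else l_lab
            err = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or err < best[0]:
                best = (err, a, t, l_lab, r_lab)
    return best[1:]

def predict(stump, row):
    a, t, l_lab, r_lab = stump
    return l_lab if row[a] <= t else r_lab

def fitness(bits, X, y, X_val, y_val, n_attrs, penalty=0.01):
    """Assumed fitness: validation accuracy minus a penalty on the selection size."""
    attrs = [j for j in range(n_attrs) if bits[j]]
    idx = [i for i in range(len(X)) if bits[n_attrs + i]]
    if not attrs or not idx:
        return -1.0  # an empty selection cannot build a tree
    stump = train_stump([X[i] for i in idx], [y[i] for i in idx], attrs)
    acc = sum(predict(stump, r) == t for r, t in zip(X_val, y_val)) / len(y_val)
    return acc - penalty * sum(bits) / len(bits)

def binary_pso(fit, n_bits, n_particles=12, iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: real-valued velocities, bits resampled through a sigmoid."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fit(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_bits):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))  # clamp to avoid saturation
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fit(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

# Toy data (synthetic, for illustration only): the label depends on attribute 0.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(40)]
y = [int(r[0] > 0.5) for r in X]
X_train, y_train, X_val, y_val = X[:30], y[:30], X[30:], y[30:]
n_attrs = 3

best_bits, best_fit = binary_pso(
    lambda b: fitness(b, X_train, y_train, X_val, y_val, n_attrs),
    n_bits=n_attrs + len(X_train),
)
selected_attrs = [j for j in range(n_attrs) if best_bits[j]]
selected_samples = [i for i in range(len(X_train)) if best_bits[n_attrs + i]]
print("attributes:", selected_attrs, "samples kept:", len(selected_samples))
```

In this sketch the size penalty is what pushes the swarm toward smaller attribute and sample subsets; a real study would likely replace the stump with a full decision tree learner and measure tree size directly.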

Keywords: Decision tree, Sampling, Attribute selection, Particle swarm optimization, Instantaneous angular speed, Fault diagnosis

Review history: Received 24 November 2018, Revised 30 March 2019, Accepted 30 March 2019, Available online 4 April 2019, Version of Record 6 April 2019.

Official paper URL: https://doi.org/10.1016/j.eswa.2019.03.052