Design of experiments and response surface methodology to tune machine learning hyperparameters, with a random forest case-study

Authors:

Highlights:

• Design of experiments identified significant hyperparameters in the random forest.

• The number of features and sampling with replacement were discarded in the screening.

• Interaction between class weights and cutoff had the largest effect on the response.

• Response surface methodology correctly tuned random forest hyperparameters.

• The methodology achieved a cross-validated balanced accuracy (BACC) of 0.81, versus 0.64 with default hyperparameters.
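
The screening step named in the highlights can be illustrated with a minimal sketch of a two-level full factorial design, in which main and interaction effects are estimated as differences of mean responses. Everything specific here is an assumption for illustration: the factor names are loosely borrowed from the highlights, `toy_response` is a hypothetical stand-in for a cross-validated BACC measurement, and its coefficients are arbitrary rather than taken from the paper.

```python
from itertools import product


def full_factorial(factors):
    """Return every run of a two-level full factorial design in coded units (-1, +1)."""
    return [dict(zip(factors, levels))
            for levels in product((-1, 1), repeat=len(factors))]


def effect(design, responses, *names):
    """Estimate a main effect (one name) or an interaction effect (several names).

    The estimate is the mean response over runs where the product of the
    coded levels is +1, minus the mean over runs where it is -1.
    """
    high, low = [], []
    for run, y in zip(design, responses):
        sign = 1
        for name in names:
            sign *= run[name]
        (high if sign > 0 else low).append(y)
    return sum(high) / len(high) - sum(low) / len(low)


def toy_response(run):
    """Hypothetical response surface; NOT the paper's data or model."""
    return (0.70
            + 0.03 * run["class_weight"]
            + 0.02 * run["cutoff"]
            + 0.05 * run["class_weight"] * run["cutoff"])


# Illustrative factor names (assumptions, coded to -1/+1 levels).
factors = ["class_weight", "cutoff", "max_features"]
design = full_factorial(factors)
responses = [toy_response(run) for run in design]

effects = {
    "class_weight": effect(design, responses, "class_weight"),
    "cutoff": effect(design, responses, "cutoff"),
    "max_features": effect(design, responses, "max_features"),
    "class_weight x cutoff": effect(design, responses, "class_weight", "cutoff"),
}

# Rank effects by magnitude; near-zero effects are candidates to discard.
for name, value in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.3f}")
```

In a real study each response would come from training and cross-validating the random forest at the coded settings, and the factors that survive screening would then be carried into a second-order response surface model.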

Keywords: Design of experiments, Hyperparameters, Machine learning, Random forest, Response surface methodology, Tuning

Review history: Received 25 November 2017, Revised 20 May 2018, Accepted 21 May 2018, Available online 26 May 2018, Version of Record 31 May 2018.

Official paper URL: https://doi.org/10.1016/j.eswa.2018.05.024