Data classification with binary response through the Boosting algorithm and logistic regression

Authors:

Highlights:

• Review of AIC and BIC information criteria focused on binary data classification.

• Usual data classification is presented with its drawbacks (i.e., low performance).

• Boosting algorithm showed enhanced results supported by MC simulation.

• Hosmer–Lemeshow test sets the partition of the training (test) data for classification.

• CHD disease classification is performed with Boosting showing its high performance.

Keywords: Boosting algorithm, Data classification, Logistic regression, Information criteria, AIC, BIC, Selection of models, Monte Carlo Simulation

Review history: Received 14 September 2015, Revised 11 April 2016, Accepted 2 August 2016, Available online 13 September 2016, Version of Record 24 October 2016.

Official paper URL: https://doi.org/10.1016/j.eswa.2016.08.014