The impact of class imbalance in classification performance metrics based on the binary confusion matrix

Authors:

Highlights:

• Imbalance coefficient fosters measuring imbalance.

• Geometric Mean and Bookmaker Informedness constitute the best unbiased metrics.

• Matthews Correlation Coefficient is the best option for error consideration.

• The concept of Class Balance Accuracy can be extended to other metrics.
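
The metrics named in the highlights can all be computed from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions of Geometric Mean, Bookmaker Informedness, and Matthews Correlation Coefficient; the paper's own notation is not reproduced in this listing, so the function name `binary_metrics`, its argument names, and the example counts are illustrative assumptions rather than the authors' code.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute common metrics from binary confusion-matrix counts.

    Hypothetical helper for illustration; assumes both classes are
    present (tp + fn > 0 and tn + fp > 0).
    """
    tpr = tp / (tp + fn)  # sensitivity (true positive rate)
    tnr = tn / (tn + fp)  # specificity (true negative rate)

    accuracy = (tp + tn) / (tp + fn + fp + tn)
    gm = math.sqrt(tpr * tnr)  # Geometric Mean
    bm = tpr + tnr - 1         # Bookmaker Informedness

    # Matthews Correlation Coefficient; defined as 0 here when any
    # marginal sum is zero (a common convention, stated as an assumption).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "gm": gm, "bm": bm, "mcc": mcc}


# Same per-class rates (TPR = 0.90, TNR = 0.95), different class balance.
balanced = binary_metrics(tp=90, fn=10, fp=5, tn=95)
imbalanced = binary_metrics(tp=90, fn=10, fp=50, tn=950)

for name in ("accuracy", "gm", "bm", "mcc"):
    print(f"{name}: balanced={balanced[name]:.4f} imbalanced={imbalanced[name]:.4f}")
```

Because Geometric Mean and Bookmaker Informedness depend only on the per-class rates, they are identical in both scenarios, whereas accuracy and Matthews Correlation Coefficient change when the negative class grows tenfold.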

Keywords: Classification, Performance measures, Imbalanced datasets, Class Balance Metrics

Review history: Received 4 September 2018, Revised 22 December 2018, Accepted 22 February 2019, Available online 28 February 2019, Version of Record 2 March 2019.

Paper URL: https://doi.org/10.1016/j.patcog.2019.02.023