The effect of imbalanced data sets on LDA: A theoretical and empirical analysis

Authors:

Abstract

This paper demonstrates theoretically that imbalanced data sets have a negative effect on the performance of LDA. The experimental results confirm this analysis: when several sampling methods are used to rebalance the imbalanced data sets, LDA performs better on the rebalanced data sets than on the original imbalanced ones.
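
As an illustration of what rebalancing by random sampling can look like, the following is a minimal sketch of random oversampling. It is not the paper's implementation: the function name `random_oversample`, the fixed seed, and the toy data are assumptions made for this example, and it covers only one of the sampling methods listed in the keywords (Tomek links and SMOTE are not shown).

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_samples = list(samples)
    out_labels = list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap to the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Toy data (assumed): 8 majority samples (label 0), 2 minority samples (label 1).
X = [[0.1 * i] for i in range(8)] + [[2.0], [2.2]]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y), Counter(y_bal))
```

After the call, both classes in `y_bal` have the same number of samples, and the rebalanced set could then be passed to any LDA implementation in place of the original data.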

Keywords: Imbalanced data sets, Linear discriminant analysis (LDA), Random sampling, Tomek links, SMOTE

Review history: Received 13 August 2005, Accepted 17 January 2006, Available online 3 March 2006.

Official URL: https://doi.org/10.1016/j.patcog.2006.01.009