Efficient GA Based Techniques for Classification

Authors: Peter K. Sharpe, Robin P. Glover

Abstract

A common approach to evaluating competing models in a classification context is to measure accuracy on a test set or on cross-validation sets. However, this can be computationally costly when genetic algorithms are applied to large datasets, and the benefits of performing a wide search are compromised because estimates of the generalization abilities of competing models are subject to noise. This paper shows that clear advantages can be gained by using samples of the test set when evaluating competing models. It further shows that applying statistical tests in combination with Occam's razor produces parsimonious models, matches the level of evaluation to the state of the search, and retains the speed advantages of test set sampling.
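The idea of comparing two candidate models on samples of the test set, and falling back on Occam's razor when a statistical test finds no significant difference, can be illustrated with a minimal sketch. The abstract does not say which statistical test or complexity measure the authors use, so everything here is an assumption made for illustration: the two-proportion z-test, the 1.96 threshold, the `(predict_fn, complexity)` model representation, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random


def sample_accuracy(predict, data, sample_size, rng):
    """Count correct predictions on a random sample of (features, label) pairs."""
    sample = rng.sample(data, min(sample_size, len(data)))
    correct = sum(1 for x, y in sample if predict(x) == y)
    return correct, len(sample)


def two_proportion_z(c1, n1, c2, n2):
    """z statistic for the difference between two sample accuracies.

    Assumed test for this sketch; the paper's own test may differ.
    """
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return 0.0 if se == 0 else (p1 - p2) / se


def choose(model_a, model_b, data, sample_size, rng, z_crit=1.96):
    """Pick between two models, each given as (predict_fn, complexity).

    Each model is scored on its own random sample of the test data.
    If the accuracies differ significantly, the more accurate model wins;
    otherwise the less complex model wins (Occam's razor).
    """
    ca, na = sample_accuracy(model_a[0], data, sample_size, rng)
    cb, nb = sample_accuracy(model_b[0], data, sample_size, rng)
    z = two_proportion_z(ca, na, cb, nb)
    if abs(z) > z_crit:
        return model_a if z > 0 else model_b
    # No significant difference: prefer the lower complexity.
    return model_a if model_a[1] <= model_b[1] else model_b
```

In a genetic algorithm, a comparison like `choose` could stand in for a full test-set evaluation during selection, with `sample_size` controlling the trade-off between evaluation cost and noise in the accuracy estimates.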

Keywords: genetic algorithms, classification, data mining

Paper URL: https://doi.org/10.1023/A:1008386925927