Improved Estimates for the Accuracy of Small Disjuncts

Author: J.R. Quinlan

Abstract

Learning systems often describe a target class as a disjunction of conjunctions of conditions. Recent work has noted that small disjuncts, i.e., those supported by few training examples, typically have poor predictive accuracy. One model of this accuracy is provided by the Bayes-Laplace formula based on the number of training examples covered by the disjunct and the number of them belonging to the target class. However, experiments show that small disjuncts associated with target classes of different relative frequencies tend to have different error rates. This note defines the context of a disjunct as the set of training examples that fail to satisfy at most one of its conditions. An empirical adaptation of the Bayes-Laplace formula is presented that also makes use of the relative frequency of the target class in this context. Trials are reported comparing the performance of the original formula and the adaptation in six learning tasks.
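The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the usual two-class Laplace form, (k + 1) / (n + 2), for a disjunct covering n training examples of which k belong to the target class; the abstract does not spell out the formula, and the paper's empirical adaptation is not reproduced here. The example layout and the function names (`laplace_accuracy`, `covered`, `context`, `target_frequency`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical data layout: (attribute values, belongs to target class).
Example = Tuple[Dict[str, float], bool]
# A condition is any boolean test on an example's attribute values.
Condition = Callable[[Dict[str, float]], bool]


def laplace_accuracy(n_covered: int, n_target: int) -> float:
    """Assumed two-class Laplace estimate of accuracy: (k + 1) / (n + 2)."""
    if not 0 <= n_target <= n_covered:
        raise ValueError("need 0 <= n_target <= n_covered")
    return (n_target + 1) / (n_covered + 2)


def failed_conditions(conditions: Sequence[Condition],
                      attrs: Dict[str, float]) -> int:
    """Count how many of the disjunct's conditions the example fails."""
    return sum(1 for cond in conditions if not cond(attrs))


def covered(conditions: Sequence[Condition],
            examples: Sequence[Example]) -> List[Example]:
    """Training examples that satisfy every condition of the disjunct."""
    return [ex for ex in examples if failed_conditions(conditions, ex[0]) == 0]


def context(conditions: Sequence[Condition],
            examples: Sequence[Example]) -> List[Example]:
    """Training examples that fail at most one condition of the disjunct."""
    return [ex for ex in examples if failed_conditions(conditions, ex[0]) <= 1]


def target_frequency(examples: Sequence[Example]) -> float:
    """Relative frequency of the target class among the given examples."""
    if not examples:
        raise ValueError("no examples")
    return sum(1 for _, is_target in examples if is_target) / len(examples)
```

Under these assumptions, a disjunct that covers only two examples, both in the target class, gets an estimated accuracy of 0.75 rather than 1.0, which reflects the caution the abstract associates with small disjuncts. The context is always a superset of the covered examples, so its target-class frequency is computed from at least as much data.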

Keywords: Disjunctive concepts, empirical learning, estimation


Paper URL: https://doi.org/10.1023/A:1022646118217