A Learning Criterion for Stochastic Rules

Author: Kenji Yamanishi

Abstract

This paper proposes a learning criterion for stochastic rules. The criterion is developed by extending Valiant's PAC (Probably Approximately Correct) learning model, which is a learning criterion for deterministic rules. Stochastic rules here refer to rules that probabilistically assign one of a number of classes, {Y}, to each attribute vector X. The proposed criterion is based on the idea that learning stochastic rules may be regarded as probably approximately correct identification of conditional probability distributions over classes for given input attribute vectors. An algorithm (an MDL algorithm) based on the MDL (Minimum Description Length) principle is used for learning stochastic rules. Specifically, for stochastic rules with finite partitioning (each of which is specified by a finite number of disjoint cells of the domain and a probability parameter vector associated with them), this paper derives target-dependent upper bounds and worst-case upper bounds on the sample size required by the MDL algorithm to learn stochastic rules with given accuracy and confidence. Based on these sample size bounds, this paper proves polynomial-sample-size learnability of stochastic decision lists (newly proposed in this paper as a stochastic analogue of Rivest's decision lists) with at most k literals (k fixed) in each decision, and polynomial-sample-size learnability of stochastic decision trees (a stochastic analogue of decision trees) with depth at most k. Sufficient conditions for polynomial-sample-size learnability and polynomial-time learnability of any class of stochastic rules with finite partitioning are also derived.
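To make the MDL idea in the abstract concrete: for a stochastic rule with finite partitioning, a two-part code lengths the model (the partition plus its probability parameters, each encoded to roughly (1/2) log n precision) and then the data given the model (negative log-likelihood under the maximum-likelihood class probabilities in each cell); the MDL algorithm selects the rule minimizing the total. The sketch below is a minimal illustration of this two-part score for a binary-class problem, not the paper's algorithm; the function name, the `model_bits` argument, and the synthetic counts are illustrative assumptions.

```python
import math

def two_part_mdl(counts, model_bits):
    """Two-part MDL score for a finite-partitioning stochastic rule (binary classes).

    counts: list of (n1, n0) class counts, one pair per cell of the partition.
    model_bits: assumed cost in bits of describing the partition structure itself.
    Returns model cost + parameter cost + data cost (in bits).
    """
    data_bits = 0.0
    k = 0  # number of probability parameters (one per cell for binary classes)
    n = 0  # total sample size
    for n1, n0 in counts:
        m = n1 + n0
        n += m
        k += 1
        # Negative log-likelihood of the cell's data at the MLE p = n1/m.
        for c in (n1, n0):
            if c > 0:
                data_bits += -c * math.log2(c / m)
    # Each real-valued parameter encoded to about (1/2) log2 n bits of precision.
    param_bits = 0.5 * k * math.log2(max(n, 2))
    return model_bits + param_bits + data_bits

# Two candidate rules on the same 200 examples: a single cell vs. a split
# into two cells whose class distributions genuinely differ.
coarse = two_part_mdl([(100, 100)], model_bits=1.0)
fine = two_part_mdl([(90, 10), (10, 90)], model_bits=2.0)
# The finer partition pays more for the model but far less for the data,
# so MDL prefers it here.
```

The extra parameter and structure bits act as the penalty that keeps the algorithm from overfitting with arbitrarily fine partitions, which is what drives the sample-size bounds in the paper.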

Keywords: Learning from examples, stochastic rules, PAC model, MDL principle, stochastic decision lists, stochastic decision trees, sample complexity

DOI: https://doi.org/10.1023/A:1022641132503