Improved MCMC sampling methods for estimating weighted sums in Winnow with application to DNF learning

作者:Qingping Tao, Stephen D. Scott

摘要

A Markov chain Monte Carlo method has previously been introduced to estimate weighted sums in multiplicative weight update algorithms when the number of inputs is exponential. However, the original algorithm still required extensive simulation of the Markov chain in order to get accurate estimates of the weighted sums. We propose an optimized version of the original algorithm that produces exactly the same classifications while often using fewer Markov chain simulations. We also apply three other sampling techniques and empirically compare them with the original Metropolis sampler to determine how effective each is in drawing good samples in the least amount of time, in terms of accuracy of weighted sum estimates and in terms of Winnow’s prediction accuracy. We found that two other samplers (Gibbs and Metropolized Gibbs) were slightly better than Metropolis in their estimates of the weighted sums. For prediction errors, there is little difference between any pair of MCMC techniques we tested. Also, on the data sets we tested, we discovered that all approximations of Winnow have no disadvantage when compared to brute force Winnow (where weighted sums are exactly computed), so generalization accuracy is not compromised by our approximation. This is true even when very small sample sizes and mixing times are used.

论文关键词:DNF, Winnow, Markov chain Monte Carlo

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10994-008-5063-9