Online calibrated forecasts: Memory efficiency versus universality for learning in games
作者:Shie Mannor, Jeff S. Shamma, Gürdal Arslan
摘要
We provide a simple learning process that enables an agent to forecast a sequence of outcomes. Our forecasting scheme, termed tracking forecast, is based on tracking the past observations while emphasizing recent outcomes. As opposed to other forecasting schemes, we sacrifice universality in favor of a significantly reduced memory requirements. We show that if the sequence of outcomes has certain properties—it has some internal (hidden) state that does not change too rapidly—then the tracking forecast is weakly calibrated so that the forecast appears to be correct most of the time. For binary outcomes, this result holds without any internal state assumptions. We consider learning in a repeated strategic game where each player attempts to compute some forecast of the opponent actions and play a best response to it. We show that if one of the players uses a tracking forecast, while the other player uses a standard learning algorithm (such as exponential regret matching or smooth fictitious play), then the player using the tracking forecast obtains the best response to the actual play of the other players. We further show that if both players use tracking forecast, then under certain conditions on the game matrix, convergence to a Nash equilibrium is possible with positive probability for a larger class of games than the class of games for which smooth fictitious play converges to a Nash equilibrium.
论文关键词:Learning in games, Forecasting, Calibration, Fictitious play, Prediction of universal sequences, Stochastic approximation, The ODE method
论文评审过程:
论文官网地址:https://doi.org/10.1007/s10994-006-0219-y