Strategy improvement for concurrent reachability and turn-based stochastic safety games
作者:
Highlights:
•
摘要
We consider concurrent games played on graphs. At every round of a game, each player simultaneously and independently selects a move; the moves jointly determine the transition to a successor state. Two basic objectives are the safety objective to stay forever in a given set of states, and its dual, the reachability objective to reach a given set of states. First, we present a simple proof of the fact that in concurrent reachability games, for all ε>0, memoryless ε-optimal strategies exist. A memoryless strategy is independent of the history of plays, and an ε-optimal strategy achieves the objective with probability within ε of the value of the game. In contrast to previous proofs of this fact, our proof is more elementary and more combinatorial. Second, we present a strategy-improvement (a.k.a. policy-iteration) algorithm for concurrent games with reachability objectives. Finally, we present a strategy-improvement algorithm for turn-based stochastic games (where each player selects moves in turns) with safety objectives. Our algorithms yield sequences of player-1 strategies which ensure probabilities of winning that converge monotonically (from below) to the value of the game.
论文关键词:Game theory,Stochastic games,Concurrent games,Reachability and safety objectives,Strategy-improvement algorithms
论文评审过程:Received 20 April 2010, Revised 3 January 2012, Accepted 12 December 2012, Available online 19 December 2012.
论文官网地址:https://doi.org/10.1016/j.jcss.2012.12.001