Dynamic randomization and domain knowledge in Monte-Carlo Tree Search for Go (Knowledge-Based Systems)

Abstract

This paper extends the article [13] presented at IWCG of TAAI 2010. It proposes two dynamic randomization techniques for Monte-Carlo Tree Search (MCTS) in Go. First, during the in-tree phase of a simulation game, the search parameters are randomized within selected ranges before each simulation move. Second, during the play-out phase, the priority orders of the simulation move generators are hierarchically randomized before each play-out move. The essential domain knowledge used in MCTS for Go is also discussed. Both techniques increase the diversity of the simulation games while preserving their sanity. The experiments have been rerun from scratch, and more extensively, with the latest version of GoIntellect (GI) on all three Go board sizes: 19 × 19, 13 × 13, and 9 × 9. The results show that dynamic randomization significantly increases the playing strength of GI. With 128K simulations per move, the winning rate against GnuGo improves over the version of GI without dynamic randomization by about seven percentage points on 19 × 19 Go, about three percentage points on 13 × 13 Go, and four percentage points on 9 × 9 Go.
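The two randomization steps described in the abstract can be illustrated with a minimal Python sketch. It is not the GoIntellect implementation. The parameter names, ranges, and generator names are hypothetical, and "hierarchically randomized" is read here as one possible interpretation: generators are grouped into priority tiers, the order is shuffled inside each tier, and the tiers themselves keep their relative order.

```python
import random


def sample_parameters(ranges, rng):
    """Draw each search parameter uniformly from its (low, high) range.

    Meant to be called before each in-tree simulation move.
    """
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


def randomize_priority_order(tiers, rng):
    """Shuffle generators inside each tier; tiers keep their relative order.

    Meant to be called before each play-out move. The tier structure is an
    assumed reading of "hierarchical" randomization.
    """
    order = []
    for tier in tiers:
        shuffled = list(tier)  # copy so the caller's tiers are not mutated
        rng.shuffle(shuffled)
        order.extend(shuffled)
    return order


rng = random.Random(0)

# Hypothetical parameter names and ranges (not taken from the paper).
PARAM_RANGES = {
    "exploration_constant": (0.3, 0.6),
    "prior_weight": (10.0, 30.0),
}

# Hypothetical move-generator names, grouped from highest to lowest priority.
GENERATOR_TIERS = [
    ["capture", "atari_escape"],
    ["pattern_3x3", "proximity"],
    ["random_legal"],
]

params = sample_parameters(PARAM_RANGES, rng)
order = randomize_priority_order(GENERATOR_TIERS, rng)
```

Under this reading, urgent tactical generators are always tried before lower-priority ones, which is one way to keep simulation games sane while still varying them from move to move.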

Keywords: Monte-Carlo Tree Search, UCT algorithm, Simulation game, Domain knowledge, Go, Search parameters, Move generators, Dynamic randomization

Article history: Received 25 May 2011, Revised 25 July 2011, Accepted 14 August 2011, Available online 27 August 2011.

Official paper URL: https://doi.org/10.1016/j.knosys.2011.08.007