Programming backgammon using self-teaching neural nets

Authors:

Abstract

TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results. Starting from random initial play, TD-Gammon's self-teaching methodology results in a surprisingly strong program: without lookahead, its positional judgement rivals that of human experts, and when combined with shallow lookahead, it reaches a level of play that surpasses even the best human players. The success of TD-Gammon has also been replicated by several other programmers; at least two other neural net programs also appear to be capable of superhuman play.
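The temporal difference learning named in the keywords can be illustrated on a much smaller problem than backgammon. The following is a minimal sketch, not TD-Gammon itself: it assumes a tabular value estimate over a 5-state random walk instead of a neural network over backgammon positions, and the names `td_lambda_episode` and `train`, the step size `alpha`, and the trace decay `lam` are all hypothetical choices made for illustration. It shows only the core idea of updating each state's value estimate toward the estimate of the state that follows it, with the final outcome (0 for a loss, 1 for a win) as the only external signal.

```python
import random


def td_lambda_episode(values, rng, alpha=0.05, lam=0.7):
    """Play one random-walk episode and update `values` in place with TD(lambda)."""
    n = len(values)
    traces = [0.0] * n   # one eligibility trace per state
    state = n // 2       # every episode starts in the middle state
    while True:
        next_state = state + rng.choice((-1, 1))
        if next_state < 0:        # walked off the left edge: outcome 0 (loss)
            target, done = 0.0, True
        elif next_state >= n:     # walked off the right edge: outcome 1 (win)
            target, done = 1.0, True
        else:                     # non-terminal: bootstrap from the next estimate
            target, done = values[next_state], False
        td_error = target - values[state]
        # Decay all traces, then mark the current state as just visited.
        # In the tabular case the gradient of V(state) is 1 for its own entry.
        traces = [lam * e for e in traces]
        traces[state] += 1.0
        for i in range(n):
            values[i] += alpha * td_error * traces[i]
        if done:
            return
        state = next_state


def train(episodes=5000, seed=0):
    """Start from uninformed estimates and learn only from episode outcomes."""
    rng = random.Random(seed)
    values = [0.5] * 5
    for _ in range(episodes):
        td_lambda_episode(values, rng)
    return values


if __name__ == "__main__":
    # Estimates should rise from the leftmost to the rightmost state.
    print([round(v, 2) for v in train()])
```

In this toy setting the true win probabilities are 1/6, 2/6, ..., 5/6 from left to right, so the learned estimates should increase across the states. A self-play program in the style the abstract describes would replace the table with a trained network and the random walk with positions reached by the program's own moves.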

Keywords: Reinforcement learning, Temporal difference learning, Neural networks, Backgammon, Games, Doubling strategy, Rollouts

Review history: Available online 28 December 2001.

Official paper URL: https://doi.org/10.1016/S0004-3702(01)00110-2