LS-VisionDraughts: improving the performance of an agent for checkers by integrating computational intelligence, reinforcement learning and a powerful search method

作者:Henrique Castro Neto, Rita Maria Silva Julia, Gutierrez Soares Caexeta, Ayres Roberto Araujo Barcelos

摘要

This paper presents LS-VisionDraughts: an efficient unsupervised evolutionary learning system for Checkers whose contribution is to automate the process of selecting an appropriate representation for the board states – by means of Evolutionary Computation – keeping a deep look-ahead (search depth) at the moment of choosing an adequate move. It corresponds to a player Multi Layer Perceptron Neural Network whose weights are updated through an evaluation function that is automatically adjusted by means of the Temporal Difference methods. A Genetic Algorithm automatically chooses a concise and efficient set of functions, which describe various scenarios associated with Checkers – called features – to represent the board states in the input layer of the Neural Network. It means that each individual of the Genetic Algorithm is a candidate set of features that is associated to a distinct Multi Layer Perceptron Neural Network. The output layer of the Neural Network is a real number (prediction) that indicates to which extent the input state is favorable to provide a better agent performance. In LS-VisionDraughts, a particular version of the search algorithm Alpha-Beta, called fail-soft Alpha-Beta, combined with Table Transposition, Iterative Deepening and ordered tree, uses this prediction value to choose the best move corresponding to the current board state. The best individual is chosen by means of numerous tournaments involving these selfsame Neural Networks. The architecture of LS-VisionDraught is inspired on the agent NeuroDraughts. However, the former system enhances the performance of the latter by automating the selection of the features through Evolutionary Computation and by replacing its Minimax search algorithm with the improved search strategy resumed above. This procedure allows for a 95 % reduction in the search runtime. Further, it remarkably increases the search tree depth. The results obtained from evaluative tournaments confirm the advances of LS-VisionDraughts compared to its opponents. It is however important to point out that LS-VisionDraughts learns practically without human supervision, contrary to the current automatic world champion Chinook, which has been built in a strongly supervised manner.

论文关键词:Artificial intelligence, Artificial neural network, Machine learning, Reinforcement learning, Evolutionary computation, Genetic algorithms, Temporal differences, Game theory

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-014-0536-y