Reinforcement learning of non-Markov decision processes

Authors:

Abstract

Techniques based on reinforcement learning (RL) have been used to build systems that learn to perform nontrivial sequential decision tasks. To date, most of this work has focused on learning tasks that can be described as Markov decision processes. While this formalism is useful for modeling a wide range of control problems, there are important tasks that are inherently non-Markov. We refer to these as hidden state tasks since they arise when information relevant to identifying the state of the environment is hidden (or missing) from the agent's immediate sensation. Two important types of control problems that resist Markov modeling are those in which (1) the system has a high degree of control over the information collected by its sensors (e.g., as in active vision), or (2) the system has a limited set of sensors that do not always provide adequate information about the current state of the environment. Existing RL algorithms perform unreliably on hidden state tasks.
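To make the notion of a hidden state task concrete, the sketch below shows a hypothetical toy problem (not taken from the paper) in which two distinct world states emit the same observation. A memoryless agent that learns over observations, such as tabular Q-learning here, sees a non-Markov process and cannot exceed chance-level return, whereas an agent with access to the true state (or with memory) could succeed every time.

```python
import random
from collections import defaultdict

# Minimal hidden-state (perceptually aliased) task: states B1 and B2
# both emit observation "B", but the rewarding action differs in each.
ACTIONS = ["left", "right"]

def reset():
    # Start in one of the two aliased states with equal probability.
    return random.choice(["B1", "B2"])

def observe(state):
    # Perceptual aliasing: B1 and B2 are indistinguishable to the agent.
    return "B"

def step(state, action):
    # In B1 the rewarding action is "left"; in B2 it is "right".
    good = "left" if state == "B1" else "right"
    return 1.0 if action == good else 0.0  # episode ends after one step

# Tabular Q-learning keyed on observations (a memoryless agent).
Q = defaultdict(float)
alpha, epsilon = 0.1, 0.1
for episode in range(20000):
    s = reset()
    o = observe(s)
    if random.random() < epsilon:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda act: Q[(o, act)])
    r = step(s, a)
    # One-step episodes, so the TD target is just the immediate reward.
    Q[(o, a)] += alpha * (r - Q[(o, a)])

# Either action succeeds on only half the episodes from the aliased
# observation, so both Q-values converge near 0.5: no memoryless policy
# can do better, while a state-aware agent would earn 1.0 every episode.
print({k: round(v, 2) for k, v in Q.items()})
```

Running the sketch shows both Q-values settling near 0.5, which is exactly the unreliability the abstract attributes to existing RL algorithms on hidden state tasks: the learner's value estimates average over states it cannot distinguish.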

Keywords:

Review process: Available online 6 April 2000.

Paper URL: https://doi.org/10.1016/0004-3702(94)00012-P