On the undecidability of probabilistic planning and related stochastic optimization problems
作者:
摘要
Automated planning, the problem of how an agent achieves a goal given a repertoire of actions, is one of the foundational and most widely studied problems in the AI literature. The original formulation of the problem makes strong assumptions regarding the agent's knowledge and control over the world, namely that its information is complete and correct, and that the results of its actions are deterministic and known. Recent research in planning under uncertainty has endeavored to relax these assumptions, providing formal and computation models wherein the agent has incomplete or noisy information about the world and has noisy sensors and effectors. This research has mainly taken one of two approaches: extend the classical planning paradigm to a semantics that admits uncertainty, or adopt another framework for approaching the problem, most commonly the Markov Decision Process (MDP) model. This paper presents a complexity analysis of planning under uncertainty. It begins with the “probabilistic classical planning” problem, showing that problem to be formally undecidable. This fundamental result is then applied to a broad class of stochastic optimization problems, in brief any problem statement where the agent (a) operates over an infinite or indefinite time horizon, and (b) has available only probabilistic information about the system's state. Undecidability is established for policy-existence problems for partially observable infinite-horizon Markov decision processes under discounted and undiscounted total reward models, average-reward models, and state-avoidance models. The results also apply to corresponding approximation problems with undiscounted objective functions. The paper answers a significant open question raised by Papadimitriou and Tsitsiklis [Math. Oper. Res. 12 (3) (1987) 441–450] about the complexity of infinite horizon POMDPs.
论文关键词:Probabilistic planning,Undecidability,Computability,Markov decision processes,Computational complexity,Infinity-horizon,Partial observability,Unobservability,Stochastic optimization,Discounted
论文评审过程:Received 22 June 2001, Available online 12 February 2003.
论文官网地址:https://doi.org/10.1016/S0004-3702(02)00378-8