Optimal threshold probability in undiscounted Markov decision processes with a target set
Authors:
Abstract
We consider risk-minimizing problems in undiscounted Markov decision processes with a target set. We formulate the problem as an infinite-horizon case with a recurrent class. We show that the optimal value function is the unique solution to an optimality equation and that there exists a stationary optimal policy. We also give several value iteration methods and a policy improvement method.
Keywords: Markov decision process, Minimizing risk model, Existence of optimal policy, Value iteration, Policy improvement method
Publication history: Available online 3 April 2003.
Official URL: https://doi.org/10.1016/S0096-3003(03)00158-9