Optimal threshold probability in undiscounted Markov decision processes with a target set

Authors:

Abstract

We consider risk-minimizing problems in undiscounted Markov decision processes with a target set. We formulate the problem as an infinite-horizon case with a recurrent class. We show that the optimal value function is the unique solution to an optimality equation and that there exists a stationary optimal policy. We also give several value iteration methods and a policy improvement method.

Keywords: Markov decision process, Minimizing risk model, Existence of optimal policy, Value iteration, Policy improvement method

Review history: Available online 3 April 2003.

Official paper URL: https://doi.org/10.1016/S0096-3003(03)00158-9
