Optimal threshold probability and expectation in semi-Markov decision processes
Abstract
We consider an undiscounted semi-Markov decision process with a target set, and our main concern is the problem of minimizing the threshold probability. We formulate the problem as an infinite-horizon case with a recurrent class. We show that the optimal value function is the unique solution to an optimality equation and that a stationary optimal policy exists. We also give several value iteration methods and a policy improvement method for our model. Furthermore, we investigate the relationship between threshold probabilities and expectations of total rewards.
Keywords: Semi-Markov decision process, Optimal threshold probability, Existence of optimal policy, Value iteration, Policy improvement method, Stochastic order
Review history: Available online 14 April 2010.
Official paper URL: https://doi.org/10.1016/j.amc.2010.04.007