
Wednesday, December 7, 2016

Effect of Reward Function Choices in Risk-Averse Reinforcement Learning. (arXiv:1612.02088v1 [cs.AI])

This paper studies Value-at-Risk problems in finite-horizon Markov decision processes (MDPs) with a finite state space and two forms of reward function. First, we study the effect of the reward function on two criteria in a short-horizon MDP. Second, for long-horizon MDPs, we estimate the total reward distribution in a finite-horizon Markov chain (MC) with the help of spectral theory and the central limit theorem, and present a transformation algorithm for MCs with a three-argument reward function and a salvage reward.
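To make the first point concrete, here is a minimal sketch (not the paper's algorithm) of how the form of the reward function can change a Value-at-Risk figure in a short-horizon Markov chain. It assumes a hypothetical two-state chain, takes Value-at-Risk to be the lower alpha-quantile of the total reward, and compares a transition-based reward r(s, s') with its state-based expectation. All names (`total_reward_distribution`, `value_at_risk`, `P`, `R`) are illustrative and do not come from the paper.

```python
from collections import defaultdict


def total_reward_distribution(P, r, start, horizon):
    """Exact distribution of the total reward over `horizon` transitions.

    P[s][t] is the probability of moving from state s to state t, and
    r(s, t) is the reward earned on that transition.
    """
    # dist maps (current state, accumulated reward) -> probability
    dist = {(start, 0.0): 1.0}
    for _ in range(horizon):
        nxt = defaultdict(float)
        for (s, total), p in dist.items():
            for t, p_st in enumerate(P[s]):
                if p_st > 0.0:
                    nxt[(t, total + r(s, t))] += p * p_st
        dist = nxt
    # Marginalize out the final state to get the total-reward distribution.
    out = defaultdict(float)
    for (_, total), p in dist.items():
        out[total] += p
    return dict(out)


def value_at_risk(dist, alpha):
    """Lower alpha-quantile: smallest x with P(total <= x) >= alpha."""
    cum = 0.0
    for x in sorted(dist):
        cum += dist[x]
        if cum >= alpha - 1e-12:  # small tolerance for float round-off
            return x
    return max(dist)


# Hypothetical two-state chain (assumption, for illustration only).
P = [[0.5, 0.5],
     [0.5, 0.5]]
R = [[0.0, 2.0],
     [0.0, 2.0]]  # reward depends on the state that is entered


def r_transition(s, t):
    return R[s][t]


def r_state(s, t):
    # Expected one-step reward in state s; ignores the next state t.
    return sum(P[s][u] * R[s][u] for u in range(len(P)))


dist_transition = total_reward_distribution(P, r_transition, start=0, horizon=2)
dist_state = total_reward_distribution(P, r_state, start=0, horizon=2)

mean_transition = sum(x * p for x, p in dist_transition.items())
mean_state = sum(x * p for x, p in dist_state.items())
var_transition = value_at_risk(dist_transition, alpha=0.1)
var_state = value_at_risk(dist_state, alpha=0.1)
```

In this toy chain both reward functions give the same expected total reward, but the state-based version removes all spread from the distribution, so the two quantiles differ. The state space here grows with the number of distinct reward totals, so this exact enumeration only suits short horizons and rewards on a small lattice.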



from cs.AI updates on arXiv.org http://ift.tt/2gcAlFo
via IFTTT
