Monday, November 7, 2016

Deep Reinforcement Learning with Averaged Target DQN. (arXiv:1611.01929v1 [cs.AI])

The commonly used Q-learning algorithm, when combined with function approximation, induces systematic overestimation of state-action values. These systematic errors can cause instability, poor performance, and sometimes divergence of learning. In this work, we present the Averaged Target DQN (ADQN) algorithm, an adaptation to the DQN class of algorithms that uses a weighted average over past learned networks to reduce generalization noise variance. As a consequence, this leads to reduced overestimation, a more stable learning process, and improved performance. Additionally, we analyze ADQN variance reduction along trajectories and demonstrate the performance of ADQN on a toy Gridworld problem, as well as on several of the Atari 2600 games from the Arcade Learning Environment.
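To make the averaging idea concrete, here is a minimal sketch of how a bootstrap target could be formed from several past network snapshots. It is an illustration under stated assumptions, not the paper's implementation: the function name `averaged_target`, its parameters, the array layout, and the default of uniform weights are all hypothetical choices made for this example, and the Q-values are passed in as plain arrays rather than produced by real networks.

```python
import numpy as np

def averaged_target(rewards, next_q_snapshots, dones, gamma=0.99, weights=None):
    """Sketch of a TD target built from an average over K past Q-networks.

    next_q_snapshots: array of shape (K, batch, actions) holding the
        next-state Q-values predicted by each of the K stored snapshots
        (hypothetical layout chosen for this example).
    weights: optional length-K weights; uniform if omitted (an assumption).
    """
    next_q = np.asarray(next_q_snapshots, dtype=float)
    k = next_q.shape[0]
    if weights is None:
        weights = np.full(k, 1.0 / k)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1

    # Weighted average over the snapshot axis -> shape (batch, actions).
    avg_q = np.tensordot(weights, next_q, axes=1)

    # Greedy value under the averaged estimate, masked at terminal states.
    max_q = avg_q.max(axis=1)
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    return rewards + gamma * (1.0 - dones) * max_q


# Two snapshots, one transition, two actions.
snapshots = [[[1.0, 2.0]], [[3.0, 0.0]]]
targets = averaged_target([1.0], snapshots, [0.0], gamma=0.5)
```

In this toy call the maximum is taken after averaging the two snapshots, so the target bootstraps from the averaged value 2.0 rather than from the larger single-snapshot maximum of 3.0, which is the intuition behind the reduced overestimation described above.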



from cs.AI updates on arXiv.org http://ift.tt/2egK64j
via IFTTT