Patrick McGuire: Unifying Count-Based Exploration and Intrinsic Motivation. (arXiv:1606.01868v1 [cs.AI])

Monday, June 6, 2016

We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use sequential density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary sequential density model. This technique enables us to generalize count-based exploration algorithms to the non-tabular case. We apply our ideas to Atari 2600 games, providing sensible pseudo-counts from raw pixels. We transform these pseudo-counts into intrinsic rewards and obtain significantly improved exploration in a number of hard games, including the infamously difficult Montezuma's Revenge.
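
As a rough illustration of the pseudo-count idea, the sketch below derives a pseudo-count from a density model by comparing the probability it assigns to an observation before and after training on that observation. With rho the probability before the update and rho' the probability after it, the paper defines the pseudo-count as rho * (1 - rho') / (rho' - rho). The toy `EmpiricalDensity` model, the function names, and the `beta` value are assumptions made for this example and are not taken from the post. The bonus form beta / sqrt(N + 0.01) follows what the paper reports, as far as I recall, so treat it as an assumption too. The paper itself uses a pixel-level density model on Atari frames, not this tabular stand-in.

```python
import math
from collections import Counter
from fractions import Fraction


class EmpiricalDensity:
    """Toy sequential density model (hypothetical): rho(x) = N(x) / n."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def prob(self, x):
        # Probability assigned to x given everything observed so far.
        if self.total == 0:
            return Fraction(0)
        return Fraction(self.counts[x], self.total)

    def update(self, x):
        # Train the model on one more observation of x.
        self.counts[x] += 1
        self.total += 1


def pseudo_count(model, x):
    """Observe x once and return its pseudo-count, or None if undefined."""
    rho = model.prob(x)        # probability before observing x
    model.update(x)
    rho_prime = model.prob(x)  # "recoding" probability after observing x
    gain = rho_prime - rho
    if gain <= 0:
        # The model did not become more confident about x, so the
        # ratio below is not defined for this observation.
        return None
    return rho * (1 - rho_prime) / gain


def exploration_bonus(n_hat, beta=0.05):
    # Assumed bonus form: beta / sqrt(N + 0.01); beta is a hypothetical value.
    return beta / math.sqrt(float(n_hat) + 0.01)


model = EmpiricalDensity()
for obs in ["a", "b", "a"]:
    model.update(obs)

seen = pseudo_count(model, "a")   # "a" was seen twice before this call
novel = pseudo_count(model, "c")  # "c" was never seen before this call
print(seen, novel)
print(exploration_bonus(seen), exploration_bonus(novel))
```

For this empirical model the pseudo-count equals the true visit count, which is the sanity check one would expect from a count-based method. The point of the technique is that the same formula can be applied to a density model that generalizes across similar observations, where exact counts are not available. The novel observation receives the larger bonus.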

from cs.AI updates on arXiv.org http://ift.tt/214gF3b
via IFTTT
