Patrick McGuire: Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. (arXiv:1603.01121v1 [cs.LG])

Thursday, March 3, 2016


Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without any prior knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Hold'em, a poker game of real-world scale, NFSP learnt a competitive strategy that approached the performance of human experts and state-of-the-art methods.
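
The abstract does not spell out the fictitious-play idea that NFSP builds on, so here is a minimal sketch of classical tabular fictitious play on a tiny zero-sum matrix game. This is not the paper's method: NFSP uses neural networks and sampled experience, whereas this toy version enumerates exact best responses. The function name `fictitious_play`, the `MATCHING_PENNIES` payoff matrix, and the iteration count are illustrative assumptions, not taken from the paper.

```python
def fictitious_play(payoff, iterations=10000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i][j] is the row player's payoff when row plays i and column
    plays j; the column player receives the negative of that value.
    Returns the empirical (average) strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Action counts; each player starts with one arbitrary play of action 0.
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = 1
    col_counts[0] = 1

    for _ in range(iterations):
        # Value of each row action against the column player's past play.
        row_values = [
            sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        # Row player's payoff for each column action against past row play.
        col_values = [
            sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Row maximises its payoff; column minimises the row payoff.
        # Ties go to the lowest-indexed action.
        row_br = max(range(n_rows), key=row_values.__getitem__)
        col_br = min(range(n_cols), key=col_values.__getitem__)
        row_counts[row_br] += 1
        col_counts[col_br] += 1

    total = iterations + 1
    return [c / total for c in row_counts], [c / total for c in col_counts]


# Matching pennies: the unique Nash equilibrium mixes 50/50 for both players.
MATCHING_PENNIES = [[1.0, -1.0], [-1.0, 1.0]]
row_avg, col_avg = fictitious_play(MATCHING_PENNIES)
print(row_avg, col_avg)
```

In this toy game the average strategies drift toward the uniform mix even though each individual play is a pure best response. That averaging is the property fictitious play relies on in zero-sum games.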


from cs.AI updates on arXiv.org http://ift.tt/1QWDK7A
via IFTTT
