
Monday, April 18, 2016

Mastering $2048$ with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping. (arXiv:1604.05085v1 [cs.AI])

$2048$ is an engaging single-player, nondeterministic video puzzle game, which, thanks to its simple rules and hard-to-master gameplay, has gained massive popularity in recent years. As $2048$ can be conveniently embedded into the discrete-state Markov decision processes framework, we treat it as a testbed for evaluating existing and new methods in reinforcement learning. With the aim of developing a strong $2048$ playing program, we employ temporal difference learning with systematic n-tuple networks. We show that this basic method can be significantly improved with temporal coherence learning, a multi-stage function approximator with weight promotion, carousel shaping, and redundant encoding. In addition, we demonstrate how to take advantage of the characteristics of the n-tuple network to improve the algorithmic effectiveness of the learning process by i) delaying the (decayed) update and ii) applying lock-free optimistic parallelism to effortlessly take advantage of multiple CPU cores. This way, we were able to develop the best known $2048$ playing program to date, which confirms the effectiveness of the introduced methods for discrete-state Markov decision problems.
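To make the baseline named in the abstract more concrete, here is a minimal sketch of temporal difference learning with an n-tuple network. It is an illustration under stated assumptions, not the authors' implementation: the class name `NTupleNetwork`, the choice of the four board rows as tuples, the board encoding (a flat list of 16 tile exponents), and the plain TD(0) update with the step split evenly across tuples are all hypothetical choices that the abstract does not specify. None of the paper's improvements (temporal coherence, multi-stage weight promotion, carousel shaping, redundant encoding, delayed updates, parallelism) are included.

```python
from typing import List, Sequence


class NTupleNetwork:
    """Value function V(board) = sum of one lookup-table entry per tuple.

    Assumption: the board is a flat list of 16 tile exponents
    (0 = empty, 1 = tile 2, 2 = tile 4, ...), each below n_values.
    """

    def __init__(self, tuples: Sequence[Sequence[int]], n_values: int = 16):
        self.tuples = [tuple(t) for t in tuples]
        self.n_values = n_values
        # One weight table per tuple, with n_values ** len(tuple) entries.
        self.tables: List[List[float]] = [
            [0.0] * (n_values ** len(t)) for t in self.tuples
        ]

    def _index(self, board: Sequence[int], cells: Sequence[int]) -> int:
        # Read the tile exponents at the tuple's cells as base-n_values digits.
        idx = 0
        for c in cells:
            idx = idx * self.n_values + board[c]
        return idx

    def value(self, board: Sequence[int]) -> float:
        return sum(
            table[self._index(board, cells)]
            for table, cells in zip(self.tables, self.tuples)
        )

    def td_update(self, board, reward, next_board, alpha, terminal=False):
        """Plain TD(0): move V(board) toward reward + V(next_board)."""
        target = reward + (0.0 if terminal else self.value(next_board))
        error = target - self.value(board)
        # Assumption: split the step evenly across the active weights.
        step = alpha * error / len(self.tuples)
        for table, cells in zip(self.tables, self.tuples):
            table[self._index(board, cells)] += step
        return error


# Hypothetical tuple layout: the four rows of the 4x4 board.
ROWS = [[r * 4 + c for c in range(4)] for r in range(4)]

net = NTupleNetwork(ROWS)
board = [0] * 16
board[0] = 1  # a single "2" tile in the top-left corner
td_error = net.td_update(board, reward=4.0, next_board=[0] * 16,
                         alpha=0.5, terminal=True)
```

Because each table is indexed directly by the tile values under its tuple, evaluating a board costs one lookup per tuple and an update touches only those same entries. That sparsity is the characteristic of the n-tuple network that makes the delayed updates and lock-free parallel updates mentioned in the abstract plausible.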



from cs.AI updates on arXiv.org http://ift.tt/23UuRht
via IFTTT
