Patrick McGuire: Concentration Bounds for Two Timescale Stochastic Approximation with Applications to Reinforcement Learning. (arXiv:1703.05376v1 [cs.AI])

Thursday, March 16, 2017

Two-timescale Stochastic Approximation (SA) algorithms are widely used in Reinforcement Learning (RL). In such methods, the iterates consist of two parts that are updated using different stepsizes. We develop the first convergence rate result for these algorithms; in particular, we provide a general methodology for analyzing two-timescale linear SA. We apply our methodology to two-timescale RL algorithms such as GTD(0), GTD2, and TDC.
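To make the "two parts updated using different stepsizes" concrete, here is a minimal sketch of a two-timescale linear SA iteration on a toy scalar problem. It is not taken from the paper: the coefficients (a11, a12, a21, a22, b1, b2), the stepsize exponents 0.9 and 0.6, the Gaussian noise, and the function name are all assumptions chosen for illustration. The only property it tries to show is that the slow iterate theta uses a stepsize that decays faster than the stepsize of the fast iterate w.

```python
import random


def two_timescale_linear_sa(num_steps=20000, noise_std=0.1, seed=0):
    """Toy scalar two-timescale linear SA (illustrative values, not from the paper)."""
    rng = random.Random(seed)

    # Hypothetical problem data. The coupled linear system
    #   b1 - a11*theta - a12*w = 0
    #   b2 - a21*theta - a22*w = 0
    # has the solution theta = w = 2/3 for these values.
    a11, a12, b1 = 1.0, 0.5, 1.0
    a21, a22, b2 = 0.5, 1.0, 1.0

    theta, w = 0.0, 0.0
    for n in range(num_steps):
        # Assumed stepsizes: alpha_n / beta_n -> 0, so theta moves on the
        # slow timescale and w on the fast one.
        alpha = (n + 1) ** -0.9
        beta = (n + 1) ** -0.6

        # Zero-mean noise stands in for sampling error in the updates.
        theta_noise = rng.gauss(0.0, noise_std)
        w_noise = rng.gauss(0.0, noise_std)

        # Both parts are updated from the same previous iterates.
        new_theta = theta + alpha * (b1 - a11 * theta - a12 * w + theta_noise)
        new_w = w + beta * (b2 - a21 * theta - a22 * w + w_noise)
        theta, w = new_theta, new_w

    return theta, w


if __name__ == "__main__":
    print(two_timescale_linear_sa())
```

In this sketch both iterates should settle near 2/3. Algorithms such as GTD(0), GTD2, and TDC have this same two-stepsize shape, but with vector iterates and update terms built from sampled transitions rather than the fixed scalars assumed here.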

from cs.AI updates on arXiv.org http://ift.tt/2mOxFO5
via IFTTT

Patrick McGuire
