Inverse reinforcement learning (IRL) is the problem of recovering a system's latent reward function from its observed behavior. In this paper, we concentrate on IRL in homogeneous large-scale systems, which we refer to as swarms. We show that, by exploiting the inherent homogeneity of a swarm, the IRL objective can be reduced to an equivalent single-agent formulation of constant complexity, which allows us to decompose a global system objective into local subgoals at the agent level. Based on this finding, we reformulate the corresponding optimal control problem as a fixed-point problem whose solution is a symmetric Nash equilibrium, which we solve using a novel heterogeneous learning scheme tailored to the swarm setting. Results on the Vicsek model and the Ising model demonstrate that the proposed framework produces meaningful reward models from which we can learn near-optimal local controllers that replicate the observed system dynamics.
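The paper's own code is not included in this post, but the Vicsek model used as one of the two testbeds is a standard swarm benchmark: each agent moves at constant speed and aligns its heading with the average heading of its neighbors, plus noise. Below is a minimal NumPy sketch of those dynamics; all parameter values (box size, radius, speed, noise level) are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def vicsek_step(pos, theta, L, r, v, eta, rng):
    """One synchronous update of the standard Vicsek model.

    pos   : (N, 2) positions in a periodic box of side L
    theta : (N,) headings in radians
    r     : interaction radius; v : constant speed; eta : noise amplitude
    """
    # Pairwise displacements under periodic (minimum-image) boundaries.
    delta = pos[:, None, :] - pos[None, :, :]
    delta -= L * np.round(delta / L)
    dist = np.linalg.norm(delta, axis=-1)
    neighbors = dist < r  # includes self, as in the original model

    # Align each heading with the circular mean of its neighbors' headings.
    mean_sin = (neighbors * np.sin(theta)[None, :]).sum(axis=1)
    mean_cos = (neighbors * np.cos(theta)[None, :]).sum(axis=1)
    new_theta = np.arctan2(mean_sin, mean_cos)

    # Perturb with uniform angular noise of amplitude eta.
    new_theta += eta * rng.uniform(-np.pi, np.pi, size=theta.shape)

    # Move at constant speed and wrap positions back into the box.
    new_pos = pos + v * np.column_stack((np.cos(new_theta), np.sin(new_theta)))
    return new_pos % L, new_theta

# Illustrative run: 100 agents in a box of side 10.
rng = np.random.default_rng(0)
pos = rng.uniform(0, 10, size=(100, 2))
theta = rng.uniform(-np.pi, np.pi, size=100)
for _ in range(200):
    pos, theta = vicsek_step(pos, theta, L=10.0, r=1.0, v=0.03, eta=0.1, rng=rng)

# Polar order parameter: 1 = fully aligned swarm, 0 = disordered.
print(f"order parameter: {np.abs(np.exp(1j * theta).mean()):.3f}")
```

Because every agent applies the same local rule, the model is homogeneous in exactly the sense the paper exploits: an IRL method only needs to recover one shared local reward rather than a separate reward per agent.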