We propose a goal-driven web navigation as a benchmark task for evaluating an agent with abilities to understand natural language and plan on partially observed environments. In this challenging task, an agent navigates through a web site, which is represented as a graph consisting of web pages as nodes and hyperlinks as directed edges, to find a web page in which a query appears. The agent is required to have sophisticated high-level reasoning based on natural languages and efficient sequential decision making capability to succeed. We release a software tool, called WebNav, that automatically transforms a website into this goal-driven web navigation task, and as an example, we make WikiNav, a dataset constructed from the English Wikipedia containing approximately 5 million articles and more than 12 million queries for training. We evaluate two different agents based on neural networks on the WikiNav and provide the human performance. Our results show the difficulty of the task for both humans and machines. With this benchmark, we expect faster progress in developing artificial agents with natural language understanding and planning skills.
from cs.AI updates on arXiv.org http://ift.tt/1ScrWO0
via IFTTT
No comments:
Post a Comment