Patrick McGuire: Image Question Answering: A Visual Semantic Embedding Model and a New Dataset. (arXiv:1505.02074v1 [cs.LG])

Sunday, May 10, 2015

Image Question Answering: A Visual Semantic Embedding Model and a New Dataset. (arXiv:1505.02074v1 [cs.LG])

This work aims to address the problem of image-based question-answering (QA) with new models and datasets. In our work, we propose to use recurrent neural networks and visual semantic embeddings without intermediate stages such as object detection and image segmentation. Our model performs 1.8 times better than the recently published results on the same dataset. Another main contribution is an automatic question generation algorithm that converts the currently available image description dataset into QA form, resulting in a 10 times bigger dataset with more evenly distributed answers.
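The abstract does not say how image descriptions are turned into question-answer pairs, so what follows is only a toy sketch of the general idea, not the paper's algorithm. It assumes a single hand-written rule (a color word directly followed by another word becomes a color question); the `COLORS` vocabulary and the function name `caption_to_color_qa` are made up for illustration.

```python
import re

# Hypothetical color vocabulary, chosen for this sketch only (not from the paper).
COLORS = {"red", "blue", "green", "yellow", "white", "black", "brown"}


def caption_to_color_qa(caption):
    """Turn each "<color> <word>" pair in a caption into a (question, answer) tuple.

    This is a deliberately naive rule: it does no parsing and treats whatever
    word follows a color word as the object being described.
    """
    words = re.findall(r"[a-z]+", caption.lower())
    pairs = []
    # Walk over adjacent word pairs and keep those that start with a color.
    for color, noun in zip(words, words[1:]):
        if color in COLORS and noun not in COLORS:
            pairs.append((f"what is the color of the {noun}?", color))
    return pairs


if __name__ == "__main__":
    for question, answer in caption_to_color_qa(
        "A man in a red shirt rides a brown horse."
    ):
        print(question, "->", answer)
```

A real conversion pipeline would presumably rely on syntactic parsing and several question types; the sketch only shows how one description can yield several QA pairs with single-word answers.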

from cs.AI updates on arXiv.org http://ift.tt/1cnzCdC
via IFTTT
