Classic machine learning algorithms learn from labelled examples. For example, to design a machine translation system, a typical training set consists of English sentences and their translations. There is a stronger model, in which the algorithm can also query for the labels of new examples that it creates. For instance, in the translation task, the algorithm can create a new English sentence and request its translation from the user during training. This combination of examples and queries has been widely studied. Yet, despite many theoretical results, query algorithms are almost never used. One of the main causes is a report (Baum and Lang, 1992) of very disappointing empirical performance of a query algorithm. These poor results were mainly attributed to the fact that the algorithm queried for labels of artificial examples that humans could not interpret.
In this work we study a new model of local membership queries (Awasthi et al., 2012), which tries to resolve the problem of artificial queries. In this model, the algorithm may query only the labels of examples that are close to examples from the training set. For example, in translation, the algorithm can change individual words in a sentence it has already seen and then ask for the translation. The examples queried by the algorithm will therefore be close to natural examples and, hopefully, will not appear artificial or random. We focus on 1-local queries (i.e., queries at distance 1 from an example in the training sample). We show that 1-local membership queries are already stronger than the standard learning model. We also present an experiment on the well-known NLP task of sentiment analysis. In this experiment, the users were asked to provide more information than merely indicating the label. We present results illustrating that this extra information is beneficial in practice.
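To make the notion of a 1-local query concrete, here is a minimal Python sketch. It is an illustration only, not the paper's algorithm: it assumes that a sentence is a sequence of words, that "distance 1" means replacing exactly one word, and that candidate replacements come from a small fixed vocabulary. The names `one_local_queries`, `hamming`, `example`, and `vocab` are hypothetical.

```python
from typing import Iterator, Sequence, Tuple


def one_local_queries(
    sentence: Sequence[str], vocabulary: Sequence[str]
) -> Iterator[Tuple[str, ...]]:
    """Yield every sentence that differs from `sentence` in exactly one word.

    Assumption: distance is measured as the number of word positions
    that differ, and replacements are drawn from `vocabulary`.
    """
    for i, original in enumerate(sentence):
        for word in vocabulary:
            if word != original:  # skip the no-op substitution
                yield tuple(sentence[:i]) + (word,) + tuple(sentence[i + 1:])


def hamming(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the positions at which two equal-length sentences differ."""
    return sum(x != y for x, y in zip(a, b))


# A training example the learner has already seen (made-up data).
example = ("the", "movie", "was", "good")
vocab = ["good", "bad", "movie", "book"]

# Candidate queries the learner would be allowed to ask a labeller about.
queries = list(one_local_queries(example, vocab))
```

Each element of `queries` stays one word away from a sentence that occurred naturally, which is the property the local model relies on to keep queries interpretable by people.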
from cs.AI updates on arXiv.org http://ift.tt/21tDE9V