Perceptual similarity between objects is multi-faceted, and similarity is easier to judge when attention is focused on a specific aspect. We consider the problem of mapping objects into view-specific embeddings in which distances are consistent with similarity comparisons of the form "from the t-th perspective, object A is more similar to B than to C". Our framework learns the view-specific embeddings jointly and can exploit correlations between views when they exist. Experiments on several datasets, including a large dataset of multi-view crowdsourced comparisons on bird images, show that the proposed method achieves lower triplet generalization error and better grouping of classes in most cases than learning an embedding independently for each view. The improvements are especially large in the realistic setting where triplet data for each view is limited.
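The core idea can be illustrated with a minimal sketch. This is not the paper's exact model: here we assume (hypothetically) that each object's view-t embedding decomposes as a shared component plus a view-specific offset, so views share statistical strength, and we fit it by SGD on a standard hinge triplet loss. All function and parameter names below are illustrative.

```python
import numpy as np

def train_multiview_embeddings(n_objects, n_views, triplets, dim=2,
                               margin=1.0, lr=0.05, epochs=200, seed=0):
    """Fit view-specific embeddings x[t, i] = shared[i] + offset[t, i]
    from triplets (t, a, b, c): "in view t, a is closer to b than to c"."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(scale=0.1, size=(n_objects, dim))
    offset = rng.normal(scale=0.1, size=(n_views, n_objects, dim))
    for _ in range(epochs):
        for t, a, b, c in triplets:
            x = shared + offset[t]          # embeddings under view t
            d_ab = x[a] - x[b]
            d_ac = x[a] - x[c]
            # hinge: violated while ||a-b||^2 + margin > ||a-c||^2
            if margin + d_ab @ d_ab - d_ac @ d_ac > 0:
                g_a = 2 * (d_ab - d_ac)     # gradients of the hinge term
                g_b = -2 * d_ab
                g_c = 2 * d_ac
                for i, g in ((a, g_a), (b, g_b), (c, g_c)):
                    shared[i] -= lr * g     # shared part couples the views
                    offset[t, i] -= lr * g  # view-specific part
    return shared, offset

# Toy example: in view 0, object 0 should end up nearer 1 than 2.
shared, offset = train_multiview_embeddings(
    n_objects=3, n_views=1, triplets=[(0, 0, 1, 2)])
x = shared + offset[0]
assert np.sum((x[0] - x[1]) ** 2) < np.sum((x[0] - x[2]) ** 2)
```

Because every triplet's gradient also updates the shared component, triplets observed in one view move the embeddings of the other views as well, which is one simple way correlated views can help when per-view triplet data is scarce.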
from cs.AI updates on arXiv.org http://ift.tt/1Bd8Lcd