This paper studies the evaluation of policies that recommend an ordered set of items based on some context, a common scenario in web search, ads, and recommender systems. We develop a novel technique to evaluate such policies offline using logged past data with negligible bias. Our method builds on the assumption that the observed quality of the entire recommended set decomposes additively across items, even though per-item quality is not directly observable and we might not be able to model it from the item's features. Empirical evidence reveals that this assumption fits many realistic scenarios, and theoretical analysis shows that we can achieve exponential savings in the amount of required data compared with naive unbiased approaches.
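To make the contrast with naive unbiased approaches concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: there is a single context, the logging policy fills each of the slots independently and uniformly from m items, and the target policy is deterministic. The function names (`ips_weight`, `additive_weight`, `estimate`, `slate_reward`) and the toy per-slot qualities `phi` are hypothetical. The naive estimator reweights only logged slates that match the target slate exactly, whereas a weight that exploits the additive assumption also credits partial matches.

```python
import itertools

def ips_weight(logged, target, m):
    # Naive whole-slate inverse propensity weight. Under the ASSUMED uniform,
    # independent-per-slot logging, each slate has probability 1 / m**l,
    # so only an exact match gets a nonzero weight of m**l.
    l = len(target)
    return float(m ** l) if tuple(logged) == tuple(target) else 0.0

def additive_weight(logged, target, m):
    # Weight that is unbiased when slate reward is a sum of per-slot terms
    # (again under the assumed uniform per-slot logging): every slot that
    # agrees with the target contributes, so partial matches carry signal.
    l = len(target)
    matches = sum(1 for s, a in zip(logged, target) if s == a)
    return m * matches - l + 1

def estimate(logs, target, m, weight_fn):
    # Average of reward times weight over logged (slate, reward) pairs.
    return sum(r * weight_fn(s, target, m) for s, r in logs) / len(logs)

def slate_reward(slate, phi):
    # Additive reward: only this slate-level sum is observed, never phi itself.
    return sum(phi[j][item] for j, item in enumerate(slate))

# Toy setup (hypothetical numbers): m = 3 items, l = 2 slots.
phi = [[0.1, 0.5, 0.2], [0.3, 0.0, 0.4]]
target = (1, 2)
true_value = slate_reward(target, phi)

# Enumerate every slate exactly once, which equals the exact expectation
# under uniform logging, so no random sampling is needed for the check.
logs = [(s, slate_reward(s, phi)) for s in itertools.product(range(3), repeat=2)]

print(true_value)
print(estimate(logs, target, 3, ips_weight))
print(estimate(logs, target, 3, additive_weight))
```

In this exact enumeration both estimates agree with the true value up to floating-point error, but the naive weight is nonzero for only one of the m**l possible slates, while the additive weight draws on every slate that shares at least one slot with the target. That difference in how much of the logged data is usable is the intuition behind the data savings the abstract describes.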
from cs.AI updates on arXiv.org http://ift.tt/1TTLkLc