Tensor factorization is an important approach to multiway data analysis. Compared with popular multilinear methods, nonlinear tensor factorization models can capture more complex relationships in the data. However, they are computationally expensive and cannot exploit data sparsity. To overcome these limitations, we propose a new tensor factorization model. The model employs a Gaussian process (GP) to capture the complex nonlinear relationships. The GP can be projected onto arbitrary sets of tensor elements, which avoids the expensive Kronecker-product computation and allows meaningful entries to be flexibly selected for training. Furthermore, to scale the model to large data, we develop a distributed variational inference algorithm in the MapReduce framework. To this end, we derive a tractable and tight variational evidence lower bound (ELBO) that enables efficient parallel computation and high-quality inference. In addition, we design a non-key-value MapReduce scheme that avoids costly data shuffling and fully exploits the memory-cache mechanism in fast MapReduce systems such as SPARK. Experiments demonstrate the advantages of our method over existing approaches in terms of both predictive performance and computational efficiency. Moreover, our approach shows promising potential for Click-Through-Rate (CTR) prediction in online advertising.
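The key modeling idea, projecting the GP onto an arbitrary subset of tensor entries, can be made concrete: each selected entry i = (i_1, ..., i_K) is mapped to an input vector by concatenating the latent factor vectors of its indices, and the GP prior is placed on the function values at exactly those N inputs, so the covariance matrix is N x N over the selected entries rather than a Kronecker product over the whole tensor. Below is a minimal NumPy sketch of this construction under assumed choices (an RBF kernel and Gaussian noise); the names are illustrative, not the paper's actual code.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0, variance=1.0):
    """RBF kernel on the concatenated latent-factor inputs (assumed choice)."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_log_marginal(entries, values, factors, noise=0.1):
    """GP log marginal likelihood over an arbitrary subset of tensor entries.

    entries : (N, K) integer array of indices of the selected entries
    values  : (N,) observed entry values
    factors : list of K latent factor matrices, factors[k] of shape (d_k, r_k)
    """
    # Input for entry i = concatenation of its K latent factor vectors.
    X = np.hstack([factors[k][entries[:, k]] for k in range(len(factors))])
    N = len(values)
    # N x N covariance over the selected entries only -- no Kronecker product.
    Knn = rbf_kernel(X, X) + noise * np.eye(N)
    L = np.linalg.cholesky(Knn)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, values))
    return (-0.5 * values @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * N * np.log(2.0 * np.pi))
```

Training then amounts to maximizing this objective, or the variational bound on it described in the abstract, with respect to the latent factors and kernel parameters.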
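The "non-key-value" MapReduce scheme can likewise be sketched: if the ELBO decomposes into a sum of per-entry terms, each map task folds its whole partition into a single tuple of partial statistics, and one associative reduce combines them on the driver, so no key-value shuffle is ever triggered and the cached entries stay in memory across iterations. The PySpark sketch below rests on that assumed decomposition; per_entry_elbo_term is a hypothetical stand-in for the paper's actual per-entry bound, and the toy data and update step are placeholders.

```python
from pyspark import SparkContext

def per_entry_elbo_term(entry, value):
    """Hypothetical stand-in for the per-entry term of the decomposed ELBO."""
    pred = 0.0  # would come from the variational posterior over the GP
    return -(value - pred) ** 2, 2.0 * (value - pred)

def local_stats(batch):
    """Map side: fold an entire partition into one statistics tuple.

    Yielding a single tuple (rather than key-value pairs) means the
    subsequent reduce needs no shuffle.
    """
    elbo, grad, n = 0.0, 0.0, 0
    for entry, value in batch:
        t, g = per_entry_elbo_term(entry, value)
        elbo += t
        grad += g
        n += 1
    yield (elbo, grad, n)

if __name__ == "__main__":
    sc = SparkContext(appName="nonlinear-tf-sketch")
    # Toy stand-in for the observed tensor entries: ((i, j, k), value) pairs.
    toy = [((i % 5, i % 7, i % 3), float(i % 2)) for i in range(1000)]
    entries = sc.parallelize(toy, numSlices=8).cache()  # kept in SPARK's memory cache
    for _ in range(10):
        elbo, grad, n = entries.mapPartitions(local_stats).reduce(
            lambda a, b: (a[0] + b[0], a[1] + b[1], a[2] + b[2]))
        # driver-side update of the variational parameters would go here
```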
from cs.AI updates on arXiv.org http://ift.tt/24l1JQE