Monday, February 20, 2017

Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks. (arXiv:1702.05870v1 [cs.LG])

Traditionally, multi-layer neural networks use the dot product between the output vector of the previous layer and the incoming weight vector as the input to the activation function. The result of the dot product is unbounded and therefore has large variance. Large variance makes the model sensitive to changes in the input distribution, which leads to poor generalization and aggravates internal covariate shift. To bound the pre-activation, we propose using cosine similarity instead of the dot product in neural networks, which we call cosine normalization. Experiments show that cosine normalization in fully connected neural networks reduces test error with lower divergence compared with other normalization techniques. Applied to convolutional networks, cosine normalization also significantly improves classification accuracy.
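To make the idea concrete, here is a minimal sketch of what a cosine-normalized fully connected layer could look like: the pre-activation is the cosine of the angle between each weight vector and the input, so it is bounded in [-1, 1] regardless of input scale. The layer sizes, the epsilon constant, and the NumPy implementation are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cosine_norm_layer(x, W, eps=1e-8):
    """Cosine-normalized pre-activations for a fully connected layer (sketch).

    x: input vector of shape (in_features,)
    W: weight matrix of shape (out_features, in_features)
    Returns pre-activations bounded in [-1, 1].
    """
    x_norm = np.linalg.norm(x) + eps           # norm of the input vector
    w_norms = np.linalg.norm(W, axis=1) + eps  # norm of each weight vector
    return (W @ x) / (w_norms * x_norm)        # cosine similarity per output unit

# Usage: even a large-magnitude input yields bounded pre-activations.
x = 100.0 * np.random.randn(10)   # hypothetical input with large scale
W = np.random.randn(5, 10)        # hypothetical weight matrix
print(cosine_norm_layer(x, W))    # all values lie in [-1, 1]
```

Compared with a plain dot product, the normalization by both vector norms is what keeps the variance of the pre-activation bounded, which is the property the abstract emphasizes.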



from cs.AI updates on arXiv.org http://ift.tt/2kSVv93
via IFTTT