Monday, December 19, 2016

Reducing Redundant Computations with Flexible Attention. (arXiv:1612.06043v1 [cs.CL])

Recently, attention mechanisms have played a key role in achieving high performance in Neural Machine Translation models. An attention mechanism applies a score function to the encoder states to obtain alignment weights. However, because this computation is performed over all source positions at every decoding step, the attention mechanism greatly increases computational complexity. In this paper we propose a novel attention model that can reduce redundant attentional computations in a flexible manner. The proposed mechanism tracks the center of attention at each decoding step and computes position-based penalties. At test time, the score-function computations for heavily penalized positions are skipped. In our experiments, we found that the computations in the attention model can be reduced by 54% on average with almost no loss of accuracy.
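The skipping rule described in the abstract lends itself to a short sketch. Below is a minimal NumPy illustration, not the paper's implementation: it assumes a quadratic position penalty around the tracked center and a fixed threshold for skipping, and uses a plain dot-product score function as a stand-in for whatever score function the model actually uses. The names `flexible_attention`, `penalty_width`, and `skip_threshold` are illustrative, not from the paper.

```python
import numpy as np

def flexible_attention(decoder_state, encoder_states, center,
                       penalty_width=5.0, skip_threshold=4.0):
    """One decoding step of attention with position-based penalties.

    Positions far from the tracked attention center receive a large
    penalty; the score function is skipped entirely for positions whose
    penalty exceeds the threshold (their weight is treated as zero).
    """
    n, d = encoder_states.shape
    positions = np.arange(n)

    # Quadratic penalty around the current attention center
    # (the exact penalty form here is an assumption).
    penalties = ((positions - center) ** 2) / (2.0 * penalty_width ** 2)

    # Skip heavily penalized positions. The penalty near the center is
    # close to zero, so at least one position always survives.
    keep = penalties < skip_threshold

    # Dot-product score, computed only for the retained positions;
    # skipped positions keep a score of -inf and get weight 0.
    scores = np.full(n, -np.inf)
    scores[keep] = encoder_states[keep] @ decoder_state - penalties[keep]

    # Softmax over the retained positions (exp(-inf) evaluates to 0).
    weights = np.exp(scores - scores[keep].max())
    weights /= weights.sum()

    context = weights @ encoder_states       # attention context vector
    new_center = float(positions @ weights)  # track center for next step
    return context, weights, new_center

# Example: 50 source positions, hidden size 8.
rng = np.random.default_rng(0)
enc = rng.normal(size=(50, 8))
ctx, w, c = flexible_attention(rng.normal(size=8), enc, center=10.0)
```

In this sketch the fraction of positions retained in `keep` is what determines the savings; the 54% average reduction reported in the paper would correspond to roughly half of the score computations being skipped per step.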
from cs.AI updates on arXiv.org http://ift.tt/2hOyHtp
via IFTTT
