Location Sensitive Attention

Introduced by Chorowski et al. in Attention-Based Models for Speech Recognition

Location Sensitive Attention is an attention mechanism that extends the additive attention mechanism to use cumulative attention weights from previous decoder time steps as an additional feature. This encourages the model to move forward consistently through the input, mitigating potential failure modes where some subsequences are repeated or ignored by the decoder.

Starting with additive attention, where $h$ is a sequential representation from a BiRNN encoder and $s_{i-1}$ is the $(i-1)$-th state of a recurrent neural network (e.g. an LSTM or GRU), the score is:

$$ e_{i, j} = w^{T}\tanh\left(W{s}_{i-1} + Vh_{j} + b\right) $$

where $w$ and $b$ are vectors and $W$ and $V$ are matrices. We extend this to be location-aware by making it take into account the alignment produced at the previous step. First, we extract a vector $f_{i,j} \in \mathbb{R}^{k}$ for every position $j$ of the previous alignment $\alpha_{i-1}$ by convolving it with a matrix $F \in \mathbb{R}^{k \times r}$, i.e. $k$ filters of width $r$:

$$ f_{i} = F * \alpha_{i-1} $$
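The convolution step can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `location_features`, the "same" padding at the sequence edges, and the assumption that the filter width $r$ does not exceed the input length are all choices made here for the example.

```python
import numpy as np

def location_features(prev_alignment, filters):
    """Compute f_i = F * alpha_{i-1}.

    prev_alignment: shape (T,), the previous alignment alpha_{i-1}
    filters:        shape (k, r), the matrix F (k filters of width r, r <= T)
    returns:        shape (T, k); row j is the vector f_{i,j}
    """
    # mode="same" (an assumption) keeps one output per input position j.
    # np.convolve flips the kernel (true convolution); deep learning
    # libraries usually compute cross-correlation instead, which only
    # changes the orientation of the learned filters.
    return np.stack(
        [np.convolve(prev_alignment, kernel, mode="same") for kernel in filters],
        axis=1,
    )

# Hypothetical example: an alignment concentrated on position 2 of 5.
alpha_prev = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
F = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 0.0]])
f = location_features(alpha_prev, F)
```

Each row of `f` summarises where the previous alignment placed its mass in a window of width $r$ around position $j$, which is what lets the scorer favour positions near the previous focus.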

These additional vectors $f_{i,j}$ are then used by the scoring mechanism $e_{i,j}$, where $U$ is an additional matrix that projects them:

$$ e_{i,j} = w^{T}\tanh\left(Ws_{i-1} + Vh_{j} + Uf_{i,j} + b\right) $$
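A single scoring step might look like the NumPy sketch below. The function names, the array shapes, the random parameters, and the "same" padding of the convolution are assumptions for illustration. The softmax normalisation of the scores into an alignment is the usual choice for attention but is not stated in the text above.

```python
import numpy as np

def location_sensitive_scores(s_prev, h, prev_alignment, W, V, U, F, w, b):
    """Scores e_{i,j} for every encoder position j.

    s_prev: (n_s,) state s_{i-1};  h: (T, n_h) encoder states h_j
    prev_alignment: (T,) alignment alpha_{i-1}
    W: (m, n_s), V: (m, n_h), U: (m, k), F: (k, r), w: (m,), b: (m,)
    """
    # f[j] = f_{i,j}: convolve alpha_{i-1} with each of the k filters in F.
    f = np.stack(
        [np.convolve(prev_alignment, kernel, mode="same") for kernel in F],
        axis=1,
    )  # shape (T, k)
    # W s_{i-1} is shared across positions and broadcasts over the T rows.
    hidden = np.tanh(s_prev @ W.T + h @ V.T + f @ U.T + b)  # shape (T, m)
    return hidden @ w  # shape (T,)

def softmax(x):
    # Subtract the max for numerical stability.
    z = np.exp(x - np.max(x))
    return z / z.sum()

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
T, n_s, n_h, m, k, r = 6, 4, 8, 5, 3, 3
params = dict(
    W=rng.normal(size=(m, n_s)), V=rng.normal(size=(m, n_h)),
    U=rng.normal(size=(m, k)), F=rng.normal(size=(k, r)),
    w=rng.normal(size=m), b=rng.normal(size=m),
)
s_prev = rng.normal(size=n_s)
h = rng.normal(size=(T, n_h))
alpha_prev = np.full(T, 1.0 / T)  # uniform alignment as a starting point

scores = location_sensitive_scores(s_prev, h, alpha_prev, **params)
alpha = softmax(scores)  # new alignment alpha_i over the T positions
```

Setting `U` to zeros removes the location term, so the function reduces to the plain additive attention score given earlier.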

Source: Attention-Based Models for Speech Recognition

Tasks

Task	Papers	Share
Speech Synthesis	14	40.00%
Text-To-Speech Synthesis	4	11.43%
Voice Cloning	2	5.71%
Speech Recognition	2	5.71%
Style Transfer	2	5.71%
Acoustic Modelling	1	2.86%
Voice Conversion	1	2.86%
Transliteration	1	2.86%
Zero-Shot Learning	1	2.86%

Components

Component	Type
Additive Attention	Attention Mechanisms

Categories

Attention Mechanisms