Hard Attention

35 papers with code • 0 benchmarks • 0 datasets

Hard attention selects a discrete subset of the input (often a single position) to attend to, instead of taking a soft, differentiable weighted average over all positions. Because the discrete selection step has no gradient, hard-attention models are typically trained with score-function estimators such as REINFORCE, with variational inference, or with continuous relaxations such as the straight-through Gumbel-softmax.
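To make the contrast concrete, here is a minimal PyTorch sketch (tensor names and shapes are illustrative, not taken from any of the papers below) of soft attention versus hard attention trained through a straight-through Gumbel-softmax:

```python
import torch
import torch.nn.functional as F

def soft_attention(query, keys, values):
    # Soft attention: a differentiable convex combination of all values.
    scores = keys @ query / keys.shape[-1] ** 0.5   # (seq_len,)
    return F.softmax(scores, dim=-1) @ values

def hard_attention(query, keys, values, tau=1.0):
    # Hard attention: commit to a single value. Sampling is non-differentiable,
    # so the straight-through Gumbel-softmax returns a one-hot sample in the
    # forward pass while backpropagating through the soft relaxation.
    scores = keys @ query / keys.shape[-1] ** 0.5
    one_hot = F.gumbel_softmax(scores, tau=tau, hard=True)
    return one_hot @ values

torch.manual_seed(0)
q, K, V = torch.randn(16), torch.randn(10, 16), torch.randn(10, 16)
print(soft_attention(q, K, V))   # blend of all 10 values
print(hard_attention(q, K, V))   # exactly one of the 10 values
```

The `hard=True` flag is one common workaround for the non-differentiability; the REINFORCE and variational approaches in the papers below are alternatives.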


Most implemented papers

Sequence-to-sequence Models for Cache Transition Systems

xiaochang13/CacheTransition-Seq2seq ACL 2018

In this paper, we present a sequence-to-sequence based approach for mapping natural language sentences to AMR semantic graphs.

Latent Alignment and Variational Attention

harvardnlp/var-attn NeurIPS 2018

This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference.
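As a rough illustration of the variational-attention idea (all tensors here are hypothetical placeholders, not the authors' code), the alignment can be treated as a latent categorical variable and trained by maximizing an ELBO with an amortized posterior:

```python
import torch
from torch.distributions import Categorical, kl_divergence

torch.manual_seed(0)
# Hypothetical quantities for one target word over 10 source positions:
prior_logits = torch.randn(10)   # p(z | x): prior over alignments
post_logits = torch.randn(10)    # q(z | x, y): amortized inference network
loglik = torch.randn(10)         # log p(y | z = i, x) for each position i

p, q = Categorical(logits=prior_logits), Categorical(logits=post_logits)
# ELBO = E_q[log p(y | z, x)] - KL(q || p); exact, since z is a small discrete set.
elbo = (q.probs * loglik).sum() - kl_divergence(q, p)
loss = -elbo   # minimize the negative ELBO
```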

Surprisingly Easy Hard-Attention for Sequence to Sequence Learning

sid7954/beam-joint-attention EMNLP 2018

In this paper we show that a simple beam approximation of the joint distribution between attention and output is an easy, accurate, and efficient attention mechanism for sequence-to-sequence learning.
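A minimal sketch of the beam idea (sizes and tensors are invented for the example): instead of marginalizing the output distribution over every source position, keep only the top-k attention candidates and renormalize over that small set:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
attn_logits = torch.randn(50)        # scores over 50 source positions
out_logits = torch.randn(50, 1000)   # p(y | attend to i), one row per position

# Keep only the top-k attention candidates (the "beam"), renormalize,
# and marginalize the output distribution over that small set.
k = 5
top_scores, top_idx = attn_logits.topk(k)
attn = F.softmax(top_scores, dim=-1)                 # (k,)
out_probs = F.softmax(out_logits[top_idx], dim=-1)   # (k, vocab)
marginal = attn @ out_probs                          # (vocab,)
```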

Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS

keonlee9420/Stepwise_Monotonic_Multihead_Attention 3 Jun 2019

In this paper, we propose a novel stepwise monotonic attention method in sequence-to-sequence acoustic modeling to improve the robustness on out-of-domain inputs.
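A toy sketch of the stepwise-monotonic constraint at inference time (the energies and the 0.5 threshold are assumptions for illustration; the paper trains a soft version of this): at each decoder step the attended encoder index either stays put or advances by exactly one, so the alignment can never skip or revisit positions.

```python
import torch

torch.manual_seed(0)
T_enc, T_dec = 20, 30
energy = torch.randn(T_dec, T_enc)   # decoder-encoder energies (hypothetical)

# At every decoder step the attended encoder index either stays put or
# advances by exactly one position, keeping the alignment strictly monotonic.
idx, alignment = 0, []
for t in range(T_dec):
    p_move = torch.sigmoid(energy[t, idx])
    if p_move > 0.5 and idx < T_enc - 1:
        idx += 1
    alignment.append(idx)
print(alignment)   # non-decreasing, steps of at most 1
```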

Graph Representation Learning via Hard and Channel-Wise Attention Networks

dmlc/dgl 5 Jul 2019

To further reduce the requirements on computational resources, we propose the channel-wise graph attention operator (cGAO), which performs attention operations along channels.
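A rough sketch of the channel-wise trick (feature sizes are invented for illustration): attending over channels instead of nodes keeps the score matrix at channels x channels, so its size does not grow with the graph:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(100, 64)   # node features: (num_nodes, channels)

# Attention over channels: queries/keys/values live in channel space, so the
# score matrix is (channels, channels) regardless of the number of nodes.
Q = K = V = X.t()                            # (channels, num_nodes)
scores = Q @ K.t() / K.shape[-1] ** 0.5      # (channels, channels)
out = (F.softmax(scores, dim=-1) @ V).t()    # back to (num_nodes, channels)
```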

Neural Architectures for Nested NER through Linearization

ufal/acl2019_nested_ner ACL 2019

We propose two neural network architectures for nested named entity recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label.

Read, Highlight and Summarize: A Hierarchical Neural Semantic Encoder-based Approach

rajeev595/RHS_HierNSE 8 Oct 2019

In this paper, we propose a method based on extracting the highlights of a document, i.e., the key concepts that are conveyed in a few sentences.

Learning Texture Transformer Network for Image Super-Resolution

researchmm/TTSR CVPR 2020

In this paper, we propose a novel Texture Transformer Network for Image Super-Resolution (TTSR), in which the LR and Ref images are formulated as queries and keys in a transformer, respectively.
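In the spirit of that formulation, a simplified sketch (patch counts and feature dimensions are hypothetical) of hard attention over reference features: compute relevance as normalized inner products, then transfer the single best-matching reference feature per query, keeping the relevance score as a soft weight for fusion:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = F.normalize(torch.randn(256, 64), dim=-1)   # queries from the LR image
k = F.normalize(torch.randn(512, 64), dim=-1)   # keys from the Ref image
v = torch.randn(512, 64)                        # HR texture features of Ref

rel = q @ k.t()                      # relevance: normalized inner products
idx = rel.argmax(dim=-1)             # hard attention: best Ref patch per query
transferred = v[idx]                 # (256, 64) transferred texture features
confidence = rel.max(dim=-1).values  # soft weights for fusing the transfer
```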

AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models

amrnag/specialized-transformers 7 Oct 2020

We propose AxFormer, a systematic framework that applies accuracy-driven approximations to create optimized transformer models for a given downstream task.

A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization

asrafulashiq/hamnet 3 Jan 2021

Moreover, our temporal semi-soft and hard attention modules, which calculate two attention scores for each video snippet, help the model focus on the less discriminative frames of an action and thereby capture the full action boundary.
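As a loose illustration of that idea (the scores and the top-k cutoff are invented for the example), "semi-soft" and "hard" scores can be derived from a base attention by suppressing the most discriminative snippets:

```python
import torch

torch.manual_seed(0)
attn = torch.rand(100)   # per-snippet soft attention scores in [0, 1]

# Suppress the k most discriminative snippets so the remaining scores
# emphasize less salient frames near the action boundaries.
k = 10
mask = torch.ones_like(attn)
mask[attn.topk(k).indices] = 0.0

semi_soft_attn = attn * mask   # top-k zeroed, others keep their soft scores
hard_attn = mask               # top-k zeroed, others set to 1
```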