Hard Attention

35 papers with code • 0 benchmarks • 0 datasets

Hard attention selects a discrete subset of the input (often a single position) to attend to, instead of taking a soft, differentiable weighted average over all positions. Because the discrete selection step has no gradient, hard-attention models are typically trained with score-function estimators such as REINFORCE, with variational inference, or with continuous relaxations such as the straight-through Gumbel-softmax.
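To make the contrast concrete, here is a minimal PyTorch sketch (tensor names and shapes are illustrative, not taken from any of the papers below) of soft attention versus hard attention trained through a straight-through Gumbel-softmax:

```python
import torch
import torch.nn.functional as F

def soft_attention(query, keys, values):
    # Soft attention: a differentiable convex combination of all values.
    scores = keys @ query / keys.shape[-1] ** 0.5   # (seq_len,)
    return F.softmax(scores, dim=-1) @ values

def hard_attention(query, keys, values, tau=1.0):
    # Hard attention: commit to a single value. Sampling is non-differentiable,
    # so the straight-through Gumbel-softmax returns a one-hot sample in the
    # forward pass while backpropagating through the soft relaxation.
    scores = keys @ query / keys.shape[-1] ** 0.5
    one_hot = F.gumbel_softmax(scores, tau=tau, hard=True)
    return one_hot @ values

torch.manual_seed(0)
q, K, V = torch.randn(16), torch.randn(10, 16), torch.randn(10, 16)
print(soft_attention(q, K, V))   # blend of all 10 values
print(hard_attention(q, K, V))   # exactly one of the 10 values
```

The `hard=True` flag is one common workaround for the non-differentiability; the REINFORCE and variational approaches in the papers below are alternatives.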


Most implemented papers

Sequence-to-sequence Models for Cache Transition Systems

xiaochang13/CacheTransition-Seq2seq ACL 2018

In this paper, we present a sequence-to-sequence based approach for mapping natural language sentences to AMR semantic graphs.

Latent Alignment and Variational Attention

harvardnlp/var-attn NeurIPS 2018

This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference.
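As a rough illustration of the variational-attention idea (all tensors here are hypothetical placeholders, not the authors' code), the alignment can be treated as a latent categorical variable and trained by maximizing an ELBO with an amortized posterior:

```python
import torch
from torch.distributions import Categorical, kl_divergence

torch.manual_seed(0)
# Hypothetical quantities for one target word over 10 source positions:
prior_logits = torch.randn(10)   # p(z | x): prior over alignments
post_logits = torch.randn(10)    # q(z | x, y): amortized inference network
loglik = torch.randn(10)         # log p(y | z = i, x) for each position i

p, q = Categorical(logits=prior_logits), Categorical(logits=post_logits)
# ELBO = E_q[log p(y | z, x)] - KL(q || p); exact, since z is a small discrete set.
elbo = (q.probs * loglik).sum() - kl_divergence(q, p)
loss = -elbo   # minimize the negative ELBO
```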

Surprisingly Easy Hard-Attention for Sequence to Sequence Learning

sid7954/beam-joint-attention EMNLP 2018

In this paper we show that a simple beam approximation of the joint distribution between attention and output is an easy, accurate, and efficient attention mechanism for sequence-to-sequence learning.
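A minimal sketch of the beam idea (sizes and tensors are invented for the example): instead of marginalizing the output distribution over every source position, keep only the top-k attention candidates and renormalize over that small set:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
attn_logits = torch.randn(50)        # scores over 50 source positions
out_logits = torch.randn(50, 1000)   # p(y | attend to i), one row per position

# Keep only the top-k attention candidates (the "beam"), renormalize,
# and marginalize the output distribution over that small set.
k = 5
top_scores, top_idx = attn_logits.topk(k)
attn = F.softmax(top_scores, dim=-1)                 # (k,)
out_probs = F.softmax(out_logits[top_idx], dim=-1)   # (k, vocab)
marginal = attn @ out_probs                          # (vocab,)
```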

Robust Sequence-to-Sequence Acoustic Modeling with Stepwise Monotonic Attention for Neural TTS

keonlee9420/Stepwise_Monotonic_Multihead_Attention 3 Jun 2019

In this paper, we propose a novel stepwise monotonic attention method in sequence-to-sequence acoustic modeling to improve the robustness on out-of-domain inputs.
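A toy sketch of the stepwise-monotonic constraint at inference time (the energies and the 0.5 threshold are assumptions for illustration; the paper trains a soft version of this): at each decoder step the attended encoder index either stays put or advances by exactly one, so the alignment can never skip or revisit positions.

```python
import torch

torch.manual_seed(0)
T_enc, T_dec = 20, 30
energy = torch.randn(T_dec, T_enc)   # decoder-encoder energies (hypothetical)

# At every decoder step the attended encoder index either stays put or
# advances by exactly one position, keeping the alignment strictly monotonic.
idx, alignment = 0, []
for t in range(T_dec):
    p_move = torch.sigmoid(energy[t, idx])
    if p_move > 0.5 and idx < T_enc - 1:
        idx += 1
    alignment.append(idx)
print(alignment)   # non-decreasing, steps of at most 1
```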

Graph Representation Learning via Hard and Channel-Wise Attention Networks

dmlc/dgl 5 Jul 2019

To further reduce the requirements on computational resources, we propose the channel-wise graph attention operator (cGAO), which performs attention operations along channels.
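A rough sketch of the channel-wise trick (feature sizes are invented for illustration): attending over channels instead of nodes keeps the score matrix at channels x channels, so its size does not grow with the graph:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
X = torch.randn(100, 64)   # node features: (num_nodes, channels)

# Attention over channels: queries/keys/values live in channel space, so the
# score matrix is (channels, channels) regardless of the number of nodes.
Q = K = V = X.t()                            # (channels, num_nodes)
scores = Q @ K.t() / K.shape[-1] ** 0.5      # (channels, channels)
out = (F.softmax(scores, dim=-1) @ V).t()    # back to (num_nodes, channels)
```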

Neural Architectures for Nested NER through Linearization

ufal/acl2019_nested_ner ACL 2019

We propose two neural network architectures for nested named entity recognition (NER), a setting in which named entities may overlap and also be labeled with more than one label.

Read, Highlight and Summarize: A Hierarchical Neural Semantic Encoder-based Approach

rajeev595/RHS_HierNSE 8 Oct 2019

In this paper, we propose a method based on extracting the highlights of a document, i.e., the key concepts that are conveyed in a few sentences.

Learning Texture Transformer Network for Image Super-Resolution

researchmm/TTSR CVPR 2020

In this paper, we propose a novel Texture Transformer Network for Image Super-Resolution (TTSR), in which the LR and Ref images are formulated as queries and keys in a transformer, respectively.
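In the spirit of that formulation, a simplified sketch (patch counts and feature dimensions are hypothetical) of hard attention over reference features: compute relevance as normalized inner products, then transfer the single best-matching reference feature per query, keeping the relevance score as a soft weight for fusion:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = F.normalize(torch.randn(256, 64), dim=-1)   # queries from the LR image
k = F.normalize(torch.randn(512, 64), dim=-1)   # keys from the Ref image
v = torch.randn(512, 64)                        # HR texture features of Ref

rel = q @ k.t()                      # relevance: normalized inner products
idx = rel.argmax(dim=-1)             # hard attention: best Ref patch per query
transferred = v[idx]                 # (256, 64) transferred texture features
confidence = rel.max(dim=-1).values  # soft weights for fusing the transfer
```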

AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models

amrnag/specialized-transformers 7 Oct 2020

We propose AxFormer, a systematic framework that applies accuracy-driven approximations to create optimized transformer models for a given downstream task.

A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization

asrafulashiq/hamnet 3 Jan 2021

Moreover, our temporal semi-soft and hard attention modules, which calculate two attention scores for each video snippet, help the model focus on the less discriminative frames of an action and thereby capture the full action boundary.
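As a loose illustration of that idea (the scores and the top-k cutoff are invented for the example), "semi-soft" and "hard" scores can be derived from a base attention by suppressing the most discriminative snippets:

```python
import torch

torch.manual_seed(0)
attn = torch.rand(100)   # per-snippet soft attention scores in [0, 1]

# Suppress the k most discriminative snippets so the remaining scores
# emphasize less salient frames near the action boundaries.
k = 10
mask = torch.ones_like(attn)
mask[attn.topk(k).indices] = 0.0

semi_soft_attn = attn * mask   # top-k zeroed, others keep their soft scores
hard_attn = mask               # top-k zeroed, others set to 1
```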