13 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

Going Beyond Linear Transformers with Recurrent Fast Weight Programmers

IDSIA/recurrent-fwp NeurIPS 2021

Transformers with linearised attention ("linear Transformers") have demonstrated the practical scalability and effectiveness of outer product-based Fast Weight Programmers (FWPs) from the '90s.

Simplified State Space Layers for Sequence Modeling

lindermanlab/S5 9 Aug 2022

Models using structured state space sequence (S4) layers have achieved state-of-the-art performance on long-range sequence modeling tasks.

ListOps: A Diagnostic Dataset for Latent Tree Learning

yikangshen/Ordered-Memory NAACL 2018

In this paper we introduce ListOps, a toy dataset created to study the parsing ability of latent tree models.
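To make the task concrete, here is a minimal sketch of what evaluating a ListOps-style expression might look like. It assumes the bracketed prefix notation with the operators MAX, MIN, MED, and SM (sum modulo 10) over single digits; the function names are hypothetical, and the handling of even-length medians (floor of the mean of the two middle values) is an assumption, not something stated on this page.

```python
def _median(values):
    """Median of a non-empty list; even-length case floors the mean (assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    return (ordered[(n - 1) // 2] + ordered[n // 2]) // 2


# Operator table: names follow the ListOps notation, without the leading "[".
OPS = {
    "MAX": max,
    "MIN": min,
    "MED": _median,
    "SM": lambda values: sum(values) % 10,  # sum modulo 10
}


def evaluate_listops(expression):
    """Evaluate a whitespace-tokenised expression such as "[MAX 2 9 [MIN 4 7 ] 0 ]"."""
    root = []   # collects the value of the outermost expression
    stack = []  # open operators, each as (name, operands collected so far)
    for token in expression.split():
        if token.startswith("["):
            name = token[1:]
            if name not in OPS:
                raise ValueError(f"unknown operator: {token}")
            stack.append((name, []))
        elif token == "]":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            name, operands = stack.pop()
            if not operands:
                raise ValueError(f"operator {name} has no operands")
            value = OPS[name](operands)
            # Pass the result up to the enclosing operator, or to the root.
            (stack[-1][1] if stack else root).append(value)
        else:
            (stack[-1][1] if stack else root).append(int(token))
    if stack or len(root) != 1:
        raise ValueError("malformed expression")
    return root[0]
```

A model trained on ListOps sees only the token sequence and the final digit, so it has to recover the nesting implicitly; the explicit stack above is the structure a latent tree model would need to induce.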

Ordered Memory

yikangshen/Ordered-Memory NeurIPS 2019

Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory.

Modeling Hierarchical Structures with Continuous Recursive Neural Networks

JRC1995/Continuous-RvNN 10 Jun 2021

We also show that CRvNN performs comparably to or better than prior latent structure models on real-world tasks such as sentiment analysis and natural language inference.

The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization

robertcsordas/ndr 14 Oct 2021

Despite progress across a broad range of applications, Transformers have limited success in systematic generalization.

ORCHARD: A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

billptw/orchard 28 Nov 2021

The ability to reason with multiple hierarchical structures is an attractive and desirable property of sequential inductive biases for natural language processing.

Dynamic Token Normalization Improves Vision Transformers

wqshao126/dtn ICLR 2022

It is difficult for Transformers to capture inductive bias such as the positional context in an image with Layer Normalization (LN).

Training Discrete Deep Generative Models via Gapped Straight-Through Estimator

chijames/gst 15 Jun 2022

While deep generative models have succeeded in image processing, natural language processing, and reinforcement learning, training that involves discrete random variables remains challenging due to the high variance of its gradient estimation process.

Sequence Modeling with Multiresolution Convolutional Memory

thjashin/multires-conv 2 May 2023

Popular approaches in the space trade off between the memory burden of brute-force enumeration and comparison, as in transformers, the computational burden of complicated sequential dependencies, as in recurrent neural networks, and the parameter burden of convolutional networks with many or large filters.