ListOps
13 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
Transformers with linearised attention ("linear Transformers") have demonstrated the practical scalability and effectiveness of outer product-based Fast Weight Programmers (FWPs) from the '90s.
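As background for the outer-product mechanism this abstract refers to, here is a minimal sketch of causal linear attention written as a Fast Weight Programmer: a "fast" weight matrix accumulates rank-one updates v_t φ(k_t)ᵀ and is queried with φ(q_t). The ELU+1 feature map and the running-sum normaliser are common illustrative choices, not necessarily the paper's exact recipe, and the function names are ours.

```python
import numpy as np

def phi(x):
    # Positive feature map (ELU + 1), a common choice for linearised
    # attention; illustrative only.
    return np.where(x > 0, x + 1.0, np.exp(x))

def fast_weight_attention(queries, keys, values):
    """Causal linear attention as a Fast Weight Programmer:
    W_t = W_{t-1} + v_t phi(k_t)^T,  y_t = W_t phi(q_t) (then normalised)."""
    d_k, d_v = keys.shape[1], values.shape[1]
    W = np.zeros((d_v, d_k))          # the "fast" weight matrix
    z = np.zeros(d_k)                 # running normaliser
    outputs = []
    for q, k, v in zip(queries, keys, values):
        W += np.outer(v, phi(k))      # rank-one outer-product update
        z += phi(k)
        y = W @ phi(q) / (phi(q) @ z + 1e-6)  # normalised read-out
        outputs.append(y)
    return np.stack(outputs)

# Toy usage: sequence of 5 tokens, key dim 4, value dim 3.
rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(5, 4)), rng.normal(size=(5, 4)), rng.normal(size=(5, 3))
print(fast_weight_attention(q, k, v).shape)  # (5, 3)
```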
Simplified State Space Layers for Sequence Modeling
Models using structured state space sequence (S4) layers have achieved state-of-the-art performance on long-range sequence modeling tasks.
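For orientation, a minimal sketch of the discrete state-space recurrence such layers compute, x_k = A x_{k-1} + B u_k, y_k = C x_k, assuming a diagonal complex transition in the spirit of S5; the paper's parameterisation, initialisation, and parallel-scan implementation differ.

```python
import numpy as np

def diagonal_ssm(u, a, B, C):
    """Run a discrete linear state-space model with a diagonal transition:
    x_k = a * x_{k-1} + B u_k,  y_k = Re(C x_k).
    a: complex diagonal of A, shape [n]; B: [n, d_in]; C: [d_out, n]."""
    x = np.zeros(a.shape, dtype=complex)
    ys = []
    for u_k in u:                      # sequential scan; S5 uses a parallel scan
        x = a * x + B @ u_k            # elementwise transition plus input
        ys.append((C @ x).real)        # linear read-out
    return np.stack(ys)

# Toy usage: length-8 scalar input, 4 complex states, scalar output.
rng = np.random.default_rng(0)
a = np.exp(-0.1 + 1j * rng.uniform(0, np.pi, size=4))   # stable poles
B = rng.normal(size=(4, 1)); C = rng.normal(size=(1, 4))
y = diagonal_ssm(rng.normal(size=(8, 1)), a, B, C)
print(y.shape)  # (8, 1)
```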
ListOps: A Diagnostic Dataset for Latent Tree Learning
In this paper we introduce ListOps, a toy dataset created to study the parsing ability of latent tree models.
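To make the task concrete, a small sketch that evaluates ListOps-style expressions such as [MAX 2 9 [MIN 4 7] 0]; the operator set (MAX, MIN, MED, SM) matches the paper, but the tokenisation details of the released dataset may differ.

```python
OPS = {
    "MAX": max,
    "MIN": min,
    "MED": lambda xs: sorted(xs)[len(xs) // 2],  # median (upper, if even count)
    "SM":  lambda xs: sum(xs) % 10,              # sum modulo 10
}

def eval_listops(expr):
    """Evaluate a ListOps expression like '[MAX 2 9 [MIN 4 7] 0]'."""
    stack = []
    for tok in expr.replace("[", "[ ").replace("]", " ]").split():
        if tok == "]":
            # Pop arguments back to the most recent operator and apply it.
            args = []
            while not callable(stack[-1]):
                args.append(stack.pop())
            op = stack.pop()
            stack.append(op(args[::-1]))
        elif tok == "[":
            continue                   # operator token follows the bracket
        elif tok in OPS:
            stack.append(OPS[tok])
        else:
            stack.append(int(tok))
    return stack.pop()

print(eval_listops("[MAX 2 9 [MIN 4 7] 0]"))  # -> 9
```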
Ordered Memory
Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operations of the memory.
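A hedged sketch of the cumulative-probability gating idea: a softmax over memory slots is turned into a monotone gate via a cumulative sum, so slots above the attended position are softly erased and rewritten while slots below it are preserved. This is our simplification for illustration, not the paper's full recursive cell.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def ordered_memory_write(memory, new_value, scores):
    """Update a stack-like memory with a cumulative-probability gate.
    memory: [slots, dim]; new_value: [dim]; scores: [slots] logits."""
    p = softmax(scores)          # distribution over slots: where to write
    g = np.cumsum(p)             # monotone gate: ~0 below, ~1 above the slot
    # Slots above the attended position are (softly) erased and rewritten;
    # slots below it are preserved.
    return (1.0 - g)[:, None] * memory + g[:, None] * new_value

mem = np.arange(12, dtype=float).reshape(4, 3)
print(ordered_memory_write(mem, np.ones(3), np.array([0.1, 2.0, 0.3, 0.1])).round(2))
```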
Modeling Hierarchical Structures with Continuous Recursive Neural Networks
We also show that CRvNN performs comparably to or better than prior latent structure models on real-world tasks such as sentiment analysis and natural language inference.
The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
Despite progress across a broad range of applications, Transformers have limited success in systematic generalization.
ORCHARD: A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning
The ability to reason with multiple hierarchical structures is an attractive and desirable property of sequential inductive biases for natural language processing.
Dynamic Token Normalization Improves Vision Transformers
With layer normalization (LN), it is difficult for Transformers to capture inductive biases such as the positional context in an image.
Training Discrete Deep Generative Models via Gapped Straight-Through Estimator
While deep generative models have succeeded in image processing, natural language processing, and reinforcement learning, training that involves discrete random variables remains challenging due to the high variance of its gradient estimation process.
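For context, a minimal sketch of the standard straight-through estimator this line of work builds on: sample a hard one-hot code in the forward pass while letting gradients flow through the softmax probabilities. The paper's Gapped Straight-Through estimator modifies this baseline; the code below is the vanilla version, not the gapped construction.

```python
import torch
import torch.nn.functional as F

def straight_through_sample(logits):
    """Forward: a hard one-hot sample. Backward: the gradient of the softmax.
    (hard - soft) is value-only, so d(output)/d(logits) = d(soft)/d(logits)."""
    soft = F.softmax(logits, dim=-1)
    index = torch.multinomial(soft, num_samples=1)
    hard = F.one_hot(index.squeeze(-1), logits.shape[-1]).float()
    return hard + (soft - soft.detach())   # straight-through trick

logits = torch.randn(2, 5, requires_grad=True)
sample = straight_through_sample(logits)
sample.sum().backward()                    # gradients reach the logits
print(sample, logits.grad.shape)
```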
Sequence Modeling with Multiresolution Convolutional Memory
Popular approaches in this space trade off among the memory burden of brute-force enumeration and comparison (as in transformers), the computational burden of complicated sequential dependencies (as in recurrent neural networks), and the parameter burden of convolutional networks with many or large filters.
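As a rough illustration of multiresolution convolutional memory, the sketch below stacks causal convolutions with doubling dilation so each level summarises the input at a coarser time scale; this wavelet-flavoured toy is our assumption-laden stand-in, not the paper's architecture.

```python
import numpy as np

def multires_context(x, filt, levels):
    """Stack causal convolutions with doubling dilation so each level
    captures the sequence at a coarser time scale (wavelet-like)."""
    outs = []
    for level in range(levels):
        dilation = 2 ** level
        pad = dilation * (len(filt) - 1)
        xp = np.concatenate([np.zeros(pad), x])   # causal left-padding
        y = np.array([
            sum(filt[j] * xp[i + pad - j * dilation] for j in range(len(filt)))
            for i in range(len(x))
        ])
        outs.append(y)
        x = y                                      # feed coarser scale upward
    return np.stack(outs)                          # [levels, length]

signal = np.sin(np.linspace(0, 6.28, 32))
print(multires_context(signal, np.array([0.5, 0.5]), levels=3).shape)  # (3, 32)
```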