16k
77 papers with code • 1 benchmark • 1 dataset
Most implemented papers
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method.
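For intuition, here is a minimal NumPy sketch of the block-sparse masking idea: attention scores are kept only for (query, key) blocks a block mask allows. This shows the math only; FlashAttention's actual speedup comes from an IO-aware tiled kernel, which this does not attempt. The block size and random mask are hypothetical demo choices.

```python
import numpy as np

def block_sparse_attention(q, k, v, block_mask, block_size):
    """q, k, v: (seq_len, d); block_mask: (n_blocks, n_blocks) bools."""
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)                     # (seq_len, seq_len)
    # Expand the block mask to token resolution; masked blocks get -inf.
    mask = np.repeat(np.repeat(block_mask, block_size, 0), block_size, 1)
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n_blocks, block_size, d = 4, 8, 16
q, k, v = (rng.standard_normal((n_blocks * block_size, d)) for _ in range(3))
# Keep the diagonal so every query block attends somewhere (no empty rows).
block_mask = np.eye(n_blocks, dtype=bool) | (rng.random((n_blocks, n_blocks)) < 0.3)
print(block_sparse_attention(q, k, v, block_mask, block_size).shape)  # (32, 16)
```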
Long Range Arena: A Benchmark for Efficient Transformers
In recent months, a wide spectrum of efficient, fast Transformers has been proposed to tackle this problem, more often than not claiming model quality superior or comparable to vanilla Transformer models.
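The problem these models tackle is the quadratic cost of self-attention in sequence length. A back-of-envelope sketch makes the gap concrete (fp32 storage and a head dimension of 64 are illustrative assumptions):

```python
# Memory for one attention score matrix, the term efficient Transformers
# try to avoid, vs. the O(n*d) state of a linear-attention-style method.
def full_attention_bytes(n, bytes_per=4):
    return n * n * bytes_per            # one (n, n) score matrix

def linear_attention_bytes(n, d=64, bytes_per=4):
    return n * d * bytes_per            # O(n*d) state instead of O(n^2)

for n in (1_000, 16_000, 64_000):
    print(f"n={n:>6}: full {full_attention_bytes(n)/2**20:8.1f} MiB, "
          f"linear {linear_attention_bytes(n)/2**20:6.2f} MiB")
```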
Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset
In this work, we introduce the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains.
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
In this paper, we introduce LongBench, the first bilingual, multi-task benchmark for long-context understanding, enabling more rigorous evaluation of these capabilities.
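Any such benchmark must deal with prompts longer than a model's context window. One common strategy is middle truncation, keeping the head and tail of the prompt so the task instruction and the final question both survive; here is a minimal sketch (the whitespace "tokenizer" and the token budget are simplifying assumptions for the demo):

```python
def truncate_middle(text: str, max_tokens: int) -> str:
    """Cut the middle of an over-long prompt, preserving head and tail."""
    tokens = text.split()               # stand-in for a real tokenizer
    if len(tokens) <= max_tokens:
        return text
    half = max_tokens // 2
    return " ".join(tokens[:half] + ["..."] + tokens[-(max_tokens - half):])

prompt = "instruction " + "context " * 50_000 + "question?"
print(len(truncate_middle(prompt, 4096).split()))  # 4097, incl. the "..." marker
```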
Long-form factuality in large language models
Empirically, we demonstrate that LLM agents can outperform crowdsourced human annotators: on a set of ~16k individual facts, SAFE agrees with crowdsourced human annotators 72% of the time, and on a random subset of 100 disagreement cases, SAFE wins 76% of the time.
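Note that the two percentages are measured on different populations: agreement over all facts, win rate only over a sample of disagreements. A tiny sketch of the bookkeeping (counts below are reconstructed from the quoted rates, purely for illustration, not the paper's raw data):

```python
total_facts = 16_000                    # "~16k individual facts"
agree = int(0.72 * total_facts)         # SAFE matches human annotators
disagreements_sampled = 100             # random subset of disagreement cases
safe_wins = 76                          # cases where SAFE was judged correct

print(f"agreement rate: {agree / total_facts:.0%}")                  # 72%
print(f"win rate on disagreements: {safe_wins / disagreements_sampled:.0%}")  # 76%
```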
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
We evaluate our instantiations at the scale of 125M to 1.3B parameters, comparing with a strong Transformer and Mamba, a modern RNN.
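For intuition, here is a minimal NumPy sketch of the test-time-training idea: the hidden state is itself a small model whose weights take one gradient step per token on a self-supervised loss. The plain reconstruction objective and learning rate here are simplifying assumptions, not the paper's exact setup.

```python
import numpy as np

def ttt_layer(xs, d, lr=0.1):
    W = np.zeros((d, d))                # hidden state = weights of inner model
    outputs = []
    for x in xs:                        # process one token at a time
        err = W @ x - x                 # inner self-supervised residual
        W -= lr * np.outer(err, x)      # one SGD step updates the hidden state
        outputs.append(W @ x)           # output uses the updated state
    return np.stack(outputs)

rng = np.random.default_rng(0)
xs = rng.standard_normal((16, 8))       # toy sequence: 16 tokens, dim 8
print(ttt_layer(xs, d=8).shape)         # (16, 8)
```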
Visual Semantic Role Labeling
In this paper we introduce the problem of Visual Semantic Role Labeling: given an image we want to detect people doing actions and localize the objects of interaction.
Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society
With the emergence of the COVID-19 pandemic, the political and medical aspects of disinformation merged, and the problem was elevated to a whole new level: the first global infodemic.
Investigating Efficiently Extending Transformers for Long Input Summarization
While large pretrained Transformer models have proven highly capable at tackling natural language tasks, handling long sequence inputs continues to be a significant challenge.
An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions
For the cloth-changing problem, video-based ReID is rarely studied due to the lack of a suitable cloth-changing benchmark, and gait recognition is often researched under controlled conditions.