8k
98 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
Large Batch Training of Convolutional Networks
Using LARS, we scaled AlexNet up to a batch size of 8K, and ResNet-50 to a batch size of 32K, without loss in accuracy.
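A minimal NumPy sketch of the layer-wise trust-ratio idea behind LARS, scaling each layer's step by ||w|| / (||g|| + wd·||w||); names and hyperparameters here are illustrative, not the paper's reference implementation.

```python
import numpy as np

def lars_update(w, grad, base_lr=0.1, trust_coeff=0.001, weight_decay=5e-4):
    """One LARS step for a single layer: scale the step by a layer-wise trust ratio."""
    w_norm = np.linalg.norm(w)
    g_norm = np.linalg.norm(grad)
    trust_ratio = trust_coeff * w_norm / (g_norm + weight_decay * w_norm + 1e-9)
    return w - base_lr * trust_ratio * (grad + weight_decay * w)

# Toy usage: one layer's weights and gradient.
w = np.random.randn(256, 128).astype(np.float32)
g = np.random.randn(256, 128).astype(np.float32)
w = lars_update(w, g)
```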
Adaptive Attention Span in Transformers
We propose a novel self-attention mechanism that can learn its optimal attention span.
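A minimal sketch of the soft span mask the paper describes, a clamped ramp over query-key distance controlled by a learnable span parameter; variable names and the renormalization step are assumptions for illustration.

```python
import torch

def span_mask(distances, z, ramp=32.0):
    """Soft mask m_z(x) = clamp((ramp + z - x) / ramp, 0, 1) over query-key distances x."""
    return torch.clamp((ramp + z - distances) / ramp, min=0.0, max=1.0)

# Toy usage: mask attention weights for one query over 1024 past positions.
z = torch.nn.Parameter(torch.tensor(100.0))     # learnable span (per head in the paper)
dist = torch.arange(1024, dtype=torch.float32)  # distance to each key
scores = torch.randn(1024)
weights = torch.softmax(scores, dim=-1) * span_mask(dist, z)
weights = weights / (weights.sum() + 1e-8)      # renormalize after masking
```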
Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting
Since the network's convolutional layers only need to operate on low-resolution inputs and outputs, memory and compute costs are kept low.
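A back-of-the-envelope check of that cost claim: a 3x3 convolution's FLOPs grow with spatial resolution, so running the network at 512x512 and aggregating residuals back to 8K is far cheaper than convolving at 8K directly (illustrative channel counts and resolutions only).

```python
def conv3x3_flops(h, w, c_in=64, c_out=64):
    # multiply-accumulates for one 3x3 conv layer at resolution h x w
    return 2 * h * w * c_in * c_out * 3 * 3

low  = conv3x3_flops(512, 512)
high = conv3x3_flops(7680, 4320)   # 8K UHD resolution
print(f"low-res: {low/1e9:.1f} GFLOPs per layer")
print(f"8K:      {high/1e9:.1f} GFLOPs per layer")
print(f"ratio:   {high/low:.0f}x")
```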
Hyena Hierarchy: Towards Larger Convolutional Language Models
Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale.
CoQA: A Conversational Question Answering Challenge
Humans gather information by engaging in conversations involving a series of interconnected questions and answers.
StarCoder: may the source be with you!
The BigCode community, an open scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities, and fast large-batch inference enabled by multi-query attention.
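A minimal PyTorch sketch of multi-query attention, the mechanism the abstract credits for fast large-batch inference: all query heads share a single key/value head, shrinking the K/V cache by the number of heads. Module names and sizes are hypothetical, and causal masking is omitted; this is not the StarCoder implementation.

```python
import torch
import torch.nn as nn

class MultiQueryAttention(nn.Module):
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)            # n_heads query heads
        self.kv_proj = nn.Linear(d_model, 2 * self.d_head)   # one shared K/V head
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k, v = self.kv_proj(x).split(self.d_head, dim=-1)
        k, v = k.unsqueeze(1), v.unsqueeze(1)                 # broadcast over heads
        att = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        y = (att @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out_proj(y)

# Toy usage: the K/V cache is n_heads times smaller than in multi-head attention.
y = MultiQueryAttention()(torch.randn(2, 16, 512))
```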
Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training
Extensive experiments are conducted with 8K images, among which 1. 5K image are randomly picked for evaluation.
ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
On this basis, we propose a new solution pipeline -- ClassSR that combines classification and SR in a unified framework.
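A minimal sketch of the routing idea: a small classifier assigns each low-resolution patch to an SR branch of matching capacity, so easy patches take the cheap path. Module names, channel widths, and the three-branch split are illustrative assumptions, not the official ClassSR code.

```python
import torch
import torch.nn as nn

class TinySR(nn.Module):
    """A small SR branch; capacity is set by its channel width."""
    def __init__(self, channels, scale=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )
    def forward(self, x):
        return self.body(x)

class ClassSR(nn.Module):
    def __init__(self):
        super().__init__()
        self.classifier = nn.Sequential(          # predicts patch difficulty
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 3),
        )
        self.branches = nn.ModuleList([TinySR(c) for c in (16, 32, 64)])

    def forward(self, patch):
        branch = self.classifier(patch).argmax(dim=-1)[0].item()
        return self.branches[branch](patch)

# Toy usage on one 32x32 low-resolution patch -> (1, 3, 128, 128) output.
sr_patch = ClassSR()(torch.randn(1, 3, 32, 32))
```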
Collapsible Linear Blocks for Super-Efficient Super Resolution
Our results highlight the challenges faced by super resolution on AI accelerators and demonstrate that SESR is significantly faster (e.g., 6x-8x higher FPS) than existing models on mobile-NPU.
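A minimal sketch of the collapsing trick behind the paper's linear blocks: a wide 3x3 convolution followed by a 1x1 projection, with no nonlinearity in between, composes into a single narrow 3x3 convolution at inference time. Shapes and tolerances are illustrative; this is not the paper's code.

```python
import torch
import torch.nn.functional as F

c_in, c_mid, c_out = 16, 64, 16
w1 = torch.randn(c_mid, c_in, 3, 3)   # wide expansion conv (train-time)
b1 = torch.randn(c_mid)
w2 = torch.randn(c_out, c_mid, 1, 1)  # 1x1 projection conv (train-time)
b2 = torch.randn(c_out)

# Collapse: compose the two linear maps into one 3x3 conv and one bias.
w = torch.einsum('om,mikl->oikl', w2[:, :, 0, 0], w1)
b = w2[:, :, 0, 0] @ b1 + b2

x = torch.randn(1, c_in, 32, 32)
y_train = F.conv2d(F.conv2d(x, w1, b1, padding=1), w2, b2)
y_infer = F.conv2d(x, w, b, padding=1)
print(torch.allclose(y_train, y_infer, atol=1e-3))  # True: same function, single conv
```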
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
First, we use synthetic language modeling tasks to understand the gap between SSMs and attention.
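A minimal sketch of an associative-recall-style synthetic task of the kind used to probe where SSMs fall short of attention: the model sees key-value pairs and must recall the value paired with a query key. The generator below is a hypothetical illustration, not the authors' exact setup.

```python
import random

def make_example(n_pairs=4, vocab=("a", "b", "c", "d"), values="0123456789"):
    """Build one sequence of key-value pairs followed by a query key."""
    pairs = {k: random.choice(values) for k in random.sample(vocab, n_pairs)}
    tokens = [t for k, v in pairs.items() for t in (k, v)]
    query = random.choice(list(pairs))
    return tokens + [query], pairs[query]   # input sequence, expected answer

seq, answer = make_example()
print(" ".join(seq), "->", answer)          # e.g. "b 7 a 3 d 1 c 9 a -> 3"
```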