63 papers with code • 0 benchmarks • 0 datasets


Most implemented papers

Large Batch Training of Convolutional Networks

facebookresearch/vissl 13 Aug 2017

Using LARS, we scaled Alexnet up to a batch size of 8K, and Resnet-50 to a batch size of 32K without loss in accuracy.
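The LARS rule the paper introduces computes a layer-wise "trust ratio" so each layer's update is scaled by the ratio of its weight norm to its gradient norm. A minimal NumPy sketch of one update step for a single layer (hyperparameter names and default values are illustrative, not the paper's exact settings):

```python
import numpy as np

def lars_step(w, grad, v, lr=0.1, momentum=0.9,
              weight_decay=5e-4, trust_coef=0.001, eps=1e-12):
    """One LARS update for one layer's weights (sketch).

    The local learning rate scales the global rate by the layer's
    weight-to-gradient norm ratio, which is what lets very large
    batches train without divergence in early epochs.
    """
    w_norm = np.linalg.norm(w)
    g_norm = np.linalg.norm(grad)
    # layer-wise trust ratio; eps guards against a zero denominator
    local_lr = trust_coef * w_norm / (g_norm + weight_decay * w_norm + eps)
    # momentum buffer accumulates the scaled (weight-decayed) gradient
    v = momentum * v + local_lr * lr * (grad + weight_decay * w)
    return w - v, v
```

In practice this is applied per layer (or per parameter tensor), so layers with small gradients relative to their weights still make progress while layers with large gradients are damped.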

Adaptive Attention Span in Transformers

facebookresearch/adaptive-span ACL 2019

We propose a novel self-attention mechanism that can learn its optimal attention span.
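The mechanism works by multiplying attention weights with a soft mask that ramps from 1 to 0 around a per-head span parameter, so the span stays differentiable and can be learned jointly with the rest of the model. A small sketch of that masking function (parameter names such as `ramp` are illustrative):

```python
import numpy as np

def soft_span_mask(distances, span, ramp=32.0):
    """Soft attention-span mask (sketch of the adaptive-span idea).

    distances: array of token distances (query position minus key position).
    span: learnable span; positions within it get weight 1, positions
          beyond span + ramp get 0, with a linear ramp in between that
          provides a gradient with respect to `span`.
    """
    return np.clip((ramp + span - distances) / ramp, 0.0, 1.0)
```

Because heads learn their own spans, most heads end up attending over short windows, cutting memory and compute versus a fixed full-length attention.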

Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting

Atlas200dk/sample-imageinpainting-HiFill CVPR 2020

Since convolutional layers of the neural network only need to operate on low-resolution inputs and outputs, the cost of memory and computing power is thus well suppressed.

Hyena Hierarchy: Towards Larger Convolutional Language Models

hazyresearch/safari 21 Feb 2023

Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale.

CoQA: A Conversational Question Answering Challenge

stanfordnlp/coqa-baselines TACL 2019

Humans gather information by engaging in conversations involving a series of interconnected questions and answers.

StarCoder: may the source be with you!

bigcode-project/starcoder 9 May 2023

The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention.

Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training

researchmm/img2poem 23 Apr 2018

Extensive experiments are conducted with 8K images, among which 1.5K images are randomly picked for evaluation.

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

Xiangtaokong/ClassSR CVPR 2021

On this basis, we propose a new solution pipeline -- ClassSR that combines classification and SR in a unified framework.

Collapsible Linear Blocks for Super-Efficient Super Resolution

ARM-software/sesr 17 Mar 2021

Our results highlight the challenges faced by super resolution on AI accelerators and demonstrate that SESR is significantly faster (e.g., 6x-8x higher FPS) than existing models on mobile-NPU.

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

hazyresearch/h3 28 Dec 2022

First, we use synthetic language modeling tasks to understand the gap between SSMs and attention.