no code implementations • 7 Feb 2024 • Zachary Ankner, Rishab Parthasarathy, Aniruddha Nrusimha, Christopher Rinard, Jonathan Ragan-Kelley, William Brandon
In this work, we propose Hydra heads, a sequentially dependent, drop-in replacement for standard draft heads that significantly improves speculation accuracy.
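The abstract does not include code, but the core idea admits a small illustration: standard (Medusa-style) draft heads each predict a future token independently from the same hidden state, while Hydra-style heads condition each prediction on the token proposed by the previous head. Below is a minimal sketch of that sequential dependence; the class name, shapes, and greedy decoding are my illustrative assumptions, not the paper's implementation.

```python
# Sketch: sequentially dependent draft heads (illustrative, not the paper's code).
import torch
import torch.nn as nn

class SequentialDraftHeads(nn.Module):
    def __init__(self, hidden_dim: int, vocab_size: int, num_heads: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        # Each head sees the base model's hidden state concatenated with the
        # embedding of the token drafted by the previous head.
        self.heads = nn.ModuleList(
            nn.Linear(2 * hidden_dim, vocab_size) for _ in range(num_heads)
        )

    def forward(self, hidden: torch.Tensor) -> list[torch.Tensor]:
        # hidden: (batch, hidden_dim), the base model's last hidden state.
        proposals = []
        prev = torch.zeros_like(hidden)  # head 0 has no earlier draft token
        for head in self.heads:
            logits = head(torch.cat([hidden, prev], dim=-1))
            token = logits.argmax(dim=-1)  # greedy draft token (assumption)
            proposals.append(token)
            prev = self.embed(token)       # next head conditions on this token
        return proposals
```

An independent-head baseline would drop `prev` and compute every head's logits from `hidden` alone; the conditioning loop is the "sequentially dependent" part.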
1 code implementation • 15 Nov 2023 • William Brandon, Aniruddha Nrusimha, Kevin Qian, Zachary Ankner, Tian Jin, Zhiye Song, Jonathan Ragan-Kelley
In experiments running Striped Attention on A100 GPUs and TPUv4s, we are able to achieve up to 1.45x end-to-end throughput improvements over the original Ring Attention algorithm on causal transformer training at a sequence length of 256k.
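A minimal sketch of the partitioning idea, under my reading of the abstract: Ring Attention gives each device a contiguous block of the sequence, so under a causal mask the devices holding later blocks do far more work; striping instead assigns tokens round-robin, so every device holds an even mix of early and late positions. Function names here are mine, not the paper's API.

```python
# Contiguous (Ring Attention-style) vs. striped sequence partitioning.
def contiguous_partition(seq_len: int, num_devices: int) -> list[list[int]]:
    block = seq_len // num_devices
    return [list(range(d * block, (d + 1) * block)) for d in range(num_devices)]

def striped_partition(seq_len: int, num_devices: int) -> list[list[int]]:
    # Round-robin: device d holds tokens d, d + num_devices, d + 2*num_devices, ...
    return [list(range(d, seq_len, num_devices)) for d in range(num_devices)]

if __name__ == "__main__":
    print(contiguous_partition(8, 2))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
    print(striped_partition(8, 2))     # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

With the contiguous split, the device holding tokens 4-7 attends to nearly the whole prefix while the other sits mostly idle; the striped split balances that causal workload.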
no code implementations • 24 May 2023 • Zachary Ankner, Naomi Saphra, Davis Blalock, Jonathan Frankle, Matthew L. Leavitt
Most works on transformers trained with the Masked Language Modeling (MLM) objective use the original BERT model's fixed masking rate of 15%.
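For readers unfamiliar with the mechanism, here is a minimal sketch of MLM input corruption with the masking rate exposed as a parameter rather than fixed at BERT's 15%. The function name, the `-100` ignore index, and masking with a single `[MASK]` id (omitting BERT's 80/10/10 replacement scheme) are simplifying assumptions for illustration.

```python
# Sketch: MLM corruption at a configurable masking rate (illustrative).
import torch

def mask_tokens(input_ids: torch.Tensor, mask_token_id: int,
                masking_rate: float = 0.15):
    """Replace a random `masking_rate` fraction of tokens with [MASK]."""
    probs = torch.full(input_ids.shape, masking_rate)
    masked = torch.bernoulli(probs).bool()
    labels = input_ids.clone()
    labels[~masked] = -100            # loss is computed only on masked positions
    corrupted = input_ids.clone()
    corrupted[masked] = mask_token_id
    return corrupted, labels
```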
no code implementations • 1 Dec 2022 • Zachary Ankner, Alex Renda, Gintare Karolina Dziugaite, Jonathan Frankle, Tian Jin
Practitioners prune neural networks for efficiency gains and generalization improvements, but few scrutinize the factors determining the prunability of a neural network: the maximum fraction of weights that pruning can remove without compromising the model's test accuracy.
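The definition of prunability suggests a simple estimation procedure, sketched below with global magnitude pruning as the pruning method and a sparsity sweep; the `evaluate` callback, the accuracy tolerance, and the choice of magnitude pruning are my assumptions, not necessarily the paper's protocol.

```python
# Sketch: estimating prunability via a global magnitude-pruning sweep.
import copy
import torch

def magnitude_prune(model: torch.nn.Module, sparsity: float) -> torch.nn.Module:
    pruned = copy.deepcopy(model)
    weights = torch.cat([p.abs().flatten() for p in pruned.parameters()])
    threshold = torch.quantile(weights, sparsity)  # global magnitude cutoff
    with torch.no_grad():
        for p in pruned.parameters():
            p.mul_((p.abs() > threshold).float())  # zero the smallest weights
    return pruned

def prunability(model, evaluate, tolerance: float = 0.01) -> float:
    """Largest tested sparsity whose accuracy stays within `tolerance` of baseline."""
    baseline = evaluate(model)  # evaluate: hypothetical test-accuracy callback
    best = 0.0
    for sparsity in [i / 20 for i in range(1, 20)]:  # sweep 5% to 95%
        if evaluate(magnitude_prune(model, sparsity)) >= baseline - tolerance:
            best = sparsity
    return best
```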
no code implementations • CVPR 2023 • J. Ryan Shue, Eric Ryan Chan, Ryan Po, Zachary Ankner, Jiajun Wu, Gordon Wetzstein
Diffusion models have emerged as the state-of-the-art for image generation, among other tasks.