Search Results for author: Trevor Gale

Found 8 papers, 7 papers with code

The State of Sparsity in Deep Neural Networks

6 code implementations · 25 Feb 2019 · Trevor Gale, Erich Elsen, Sara Hooker

We rigorously evaluate three state-of-the-art techniques for inducing sparsity in deep neural networks on two large-scale learning tasks: Transformer trained on WMT 2014 English-to-German, and ResNet-50 trained on ImageNet.

Model Compression · Sparse Learning
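
Magnitude pruning is one of the three sparsity-inducing techniques the paper evaluates (alongside variational dropout and l0 regularization). Below is a minimal NumPy sketch of global magnitude pruning; the global threshold, layer shapes, and target sparsity are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries until `sparsity` fraction are zero."""
    flat = np.abs(np.concatenate([w.ravel() for w in weights]))
    k = int(sparsity * flat.size)
    if k == 0:
        return [np.ones_like(w, dtype=bool) for w in weights]
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude overall
    return [np.abs(w) > threshold for w in weights]  # boolean keep-masks

# Toy usage: two "layers", prune ~80% of all weights by global magnitude.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 64)), rng.standard_normal((64, 10))]
masks = magnitude_prune(layers, sparsity=0.8)
pruned = [w * m for w, m in zip(layers, masks)]
print([round(1.0 - m.mean(), 3) for m in masks])  # per-layer sparsity
```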

Fast Sparse ConvNets

4 code implementations · CVPR 2020 · Erich Elsen, Marat Dukhan, Trevor Gale, Karen Simonyan

Equipped with our efficient implementation of sparse primitives, we show that sparse versions of MobileNet v1, MobileNet v2 and EfficientNet architectures substantially outperform strong dense baselines on the efficiency-accuracy curve.
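A core observation behind fast sparse ConvNets is that the 1x1 convolutions dominating MobileNet-style architectures are plain matrix multiplications, so pruned weights turn them into sparse-matrix times dense-matrix products (SpMM). The sketch below illustrates that equivalence with SciPy's CSR format standing in for a tuned sparse kernel; the shapes and sparsity level are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import csr_matrix

# A 1x1 convolution with C_in input and C_out output channels applied to an
# H x W feature map is a (C_out x C_in) @ (C_in x H*W) matrix product.
rng = np.random.default_rng(0)
c_in, c_out, h, w = 128, 256, 14, 14

dense_weights = rng.standard_normal((c_out, c_in))
dense_weights[rng.random((c_out, c_in)) < 0.9] = 0.0   # ~90% unstructured sparsity
sparse_weights = csr_matrix(dense_weights)              # store nonzeros only

activations = rng.standard_normal((c_in, h * w))        # flattened spatial positions

out_sparse = sparse_weights @ activations               # SpMM
out_dense = dense_weights @ activations                 # dense reference
assert np.allclose(out_sparse, out_dense)
print(out_sparse.shape)  # (256, 196); reshaped to (C_out, H, W) downstream
```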

MegaBlocks: Efficient Sparse Training with Mixture-of-Experts

2 code implementations · 29 Nov 2022 · Trevor Gale, Deepak Narayanan, Cliff Young, Matei Zaharia

We present MegaBlocks, a system for efficient Mixture-of-Experts (MoE) training on GPUs.
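MegaBlocks reformulates MoE computation so that no tokens are dropped even though each expert receives a different number of tokens per batch. The sketch below is a generic top-1 routing loop that makes that irregularity visible; it is a conceptual illustration only, not MegaBlocks' block-sparse GPU kernels, and all dimensions are made up for the example.

```python
import numpy as np

# "Dropless" top-1 MoE routing: every token goes to its chosen expert, so
# per-expert batch sizes vary -- the irregular computation that MegaBlocks
# maps onto block-sparse matrix multiplication instead of padding or dropping.
rng = np.random.default_rng(0)
num_tokens, d_model, num_experts, d_ff = 8, 16, 4, 32

tokens = rng.standard_normal((num_tokens, d_model))
router = rng.standard_normal((d_model, num_experts))
expert_w1 = rng.standard_normal((num_experts, d_model, d_ff))
expert_w2 = rng.standard_normal((num_experts, d_ff, d_model))

assignments = (tokens @ router).argmax(axis=1)     # top-1 expert per token
output = np.zeros_like(tokens)
for e in range(num_experts):
    idx = np.where(assignments == e)[0]            # tokens routed to expert e
    if idx.size == 0:
        continue
    hidden = np.maximum(tokens[idx] @ expert_w1[e], 0.0)   # expert FFN with ReLU
    output[idx] = hidden @ expert_w2[e]

print(np.bincount(assignments, minlength=num_experts))     # uneven expert loads
```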

On the Opportunities and Risks of Foundation Models

2 code implementations · 16 Aug 2021 · Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Rigging the Lottery: Making All Tickets Winners

8 code implementations · ICML 2020 · Utku Evci, Trevor Gale, Jacob Menick, Pablo Samuel Castro, Erich Elsen

There is a large body of work on training dense networks to yield sparse networks for inference, but this limits the size of the largest trainable sparse model to that of the largest trainable dense model.

Image Classification · Language Modelling +1
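
RigL trains sparse networks directly by periodically updating the connectivity pattern: it drops the active weights with the smallest magnitude and grows the same number of new connections where the gradient magnitude is largest. The NumPy sketch below shows one such drop-and-grow step; the update fraction, shapes, and lack of a decay schedule are illustrative simplifications.

```python
import numpy as np

def rigl_update(weight, mask, grad, update_fraction=0.3):
    """One drop-and-grow step in the spirit of RigL (illustrative sketch).

    Drops the active connections with the smallest weight magnitude and grows
    an equal number of inactive connections with the largest gradient
    magnitude, so the total number of nonzeros stays fixed.
    """
    n_update = int(update_fraction * mask.sum())
    if n_update == 0:
        return mask

    # Drop: smallest-magnitude active weights.
    active_mag = np.where(mask, np.abs(weight), np.inf)
    drop_idx = np.argsort(active_mag, axis=None)[:n_update]

    # Grow: largest-gradient inactive connections.
    inactive_grad = np.where(mask, -np.inf, np.abs(grad))
    grow_idx = np.argsort(inactive_grad, axis=None)[-n_update:]

    new_mask = mask.copy().ravel()
    new_mask[drop_idx] = False
    new_mask[grow_idx] = True
    return new_mask.reshape(mask.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((32, 32))
m = rng.random((32, 32)) < 0.2        # ~20% dense to start
g = rng.standard_normal((32, 32))     # stand-in for a gradient estimate
m2 = rigl_update(w, m, g)
print(m.sum(), m2.sum())              # nonzero count is preserved
```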

Sparse GPU Kernels for Deep Learning

1 code implementation · 18 Jun 2020 · Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen

In this work, we study sparse matrices from deep learning applications and identify favorable properties that can be exploited to accelerate computation.
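Among the favorable properties the paper points to is that deep learning sparse matrices are only moderately sparse and have their nonzeros spread fairly evenly across rows, unlike many scientific-computing matrices. The sketch below measures that row balance for a randomly pruned weight matrix; the sizes and sparsity level are illustrative, and the code is a structural analysis, not the paper's GPU kernels.

```python
import numpy as np

# Row-balance check for an unstructured-sparse weight matrix: a small
# coefficient of variation in nonzeros per row means rows do similar amounts
# of work, so a row-parallel SpMM kernel stays well balanced on the GPU.
rng = np.random.default_rng(0)
rows, cols, sparsity = 512, 512, 0.9

weight = rng.standard_normal((rows, cols))
weight[rng.random((rows, cols)) < sparsity] = 0.0   # ~90% unstructured sparsity

nnz_per_row = np.count_nonzero(weight, axis=1)
print("mean nonzeros per row:", nnz_per_row.mean())
print("coefficient of variation:", nnz_per_row.std() / nnz_per_row.mean())
```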

dMath: Distributed Linear Algebra for DL

no code implementations · 19 Nov 2016 · Steven Eliuk, Cameron Upright, Hars Vardhan, Stephen Walsh, Trevor Gale

The paper presents a parallel math library, dMath, that demonstrates leading scaling when using intranode, internode, and hybrid-parallelism for deep learning (DL).

Management · Math
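
To make the kind of parallelism such a library orchestrates concrete, here is a minimal NumPy sketch of data-parallel matrix multiplication: the activation matrix is split row-wise across workers, each worker multiplies its shard locally, and the results are concatenated. This is a generic illustration of the idea, not dMath's API, and the worker count and shapes are assumptions.

```python
import numpy as np

# Row-partitioned GEMM: one shard per worker, local multiply, then gather.
rng = np.random.default_rng(0)
num_workers = 4
activations = rng.standard_normal((1024, 256))   # batch x features
weights = rng.standard_normal((256, 512))

shards = np.array_split(activations, num_workers, axis=0)   # one shard per worker
partials = [shard @ weights for shard in shards]             # local GEMMs
result = np.concatenate(partials, axis=0)                    # gather results

assert np.allclose(result, activations @ weights)
```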
