Search Results for author: Yu Emma Wang

Found 8 papers, 2 papers with code

Hadamard Domain Training with Integers for Class Incremental Quantized Learning

no code implementations • 5 Oct 2023 • Martin Schiemer, Clemens JS Schaefer, Jayden Parker Vap, Mark James Horeni, Yu Emma Wang, Juan Ye, Siddharth Joshi

In this paper, we propose a technique that leverages inexpensive Hadamard transforms to enable low-precision training with only integer matrix multiplications.

Class Incremental Learning · Human Activity Recognition +2
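The abstract above only names the trick, so here is a minimal, self-contained sketch of how a Hadamard-domain integer matmul can work in principle. The int8 scheme, per-tensor scales, and sizes are assumptions for illustration, not the authors' training pipeline.

```python
# Sketch only: rotate operands with a Hadamard matrix, quantize to int8, and do
# the heavy product as an integer-only matmul. Not the paper's implementation.
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n (+1/-1) Hadamard matrix, n a power of two."""
    h = np.array([[1.0]], dtype=np.float32)
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization; returns integer values and the scale."""
    scale = np.max(np.abs(x)) / 127.0 + 1e-12
    return np.round(x / scale).astype(np.int8), scale

def hadamard_int8_matmul(a, b):
    """Compute a @ b via an integer matmul in the Hadamard domain.

    Relies on A @ B = (A @ H) @ (H.T @ B) / n, since H @ H.T = n * I.
    The rotation spreads outliers, so int8 quantization loses less information.
    In a real kernel the Hadamard rotation itself needs only additions and
    subtractions; it is written as a plain matmul here for clarity.
    """
    n = a.shape[1]
    h = hadamard(n)
    aq, sa = quantize_int8(a @ h)                      # rotate, then quantize
    bq, sb = quantize_int8(h.T @ b)
    acc = aq.astype(np.int32) @ bq.astype(np.int32)    # integer-only accumulation
    return acc.astype(np.float32) * (sa * sb / n)      # rescale back to floats

rng = np.random.default_rng(0)
a = rng.normal(size=(8, 64)).astype(np.float32)
b = rng.normal(size=(64, 16)).astype(np.float32)
print(np.max(np.abs(hadamard_int8_matmul(a, b) - a @ b)))  # small quantization error
```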

Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization

no code implementations • 8 Jun 2023 • Clemens JS Schaefer, Navid Lambert-Shirzad, Xiaofan Zhang, Chiachen Chou, Tom Jablin, Jian Li, Elfie Guo, Caitlin Stanton, Siddharth Joshi, Yu Emma Wang

To address this challenge, we propose a mixed-precision post training quantization (PTQ) approach that assigns different numerical precisions to tensors in a network based on their specific needs, for a reduced memory footprint and improved latency while preserving model accuracy.

Quantization
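As a rough illustration of sensitivity-driven mixed-precision assignment, the sketch below spends a bit budget greedily on the tensors with the highest sensitivity per extra bit. The scores, sizes, greedy rule, and 4/8-bit menu are assumptions, not the paper's Hessian-with-inter-layer-dependencies method.

```python
# Illustrative sketch only: pick per-tensor bit widths from sensitivity scores
# under a total memory budget. Tensor names and numbers are made up.
def assign_precisions(sensitivities, sizes, budget_bits, choices=(4, 8)):
    """Start every tensor at the lowest precision, then spend the remaining
    budget on the tensors with the highest sensitivity per extra bit."""
    low, high = min(choices), max(choices)
    bits = {name: low for name in sensitivities}
    spent = sum(low * sizes[n] for n in sensitivities)
    # Upgrade the most sensitive tensors (relative to their size cost) first.
    order = sorted(sensitivities, key=lambda n: sensitivities[n] / sizes[n], reverse=True)
    for name in order:
        extra = (high - low) * sizes[name]
        if spent + extra <= budget_bits:
            bits[name] = high
            spent += extra
    return bits

# Toy example: three tensors with invented sensitivity scores and element counts.
sens = {"conv1.w": 9.0, "fc.w": 1.5, "conv2.w": 4.0}
sizes = {"conv1.w": 10_000, "fc.w": 50_000, "conv2.w": 20_000}
print(assign_precisions(sens, sizes, budget_bits=6 * 80_000))  # ~6-bit average budget
```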

Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search

no code implementations • 2 Feb 2023 • Clemens JS Schaefer, Elfie Guo, Caitlin Stanton, Xiaofan Zhang, Tom Jablin, Navid Lambert-Shirzad, Jian Li, Chiachen Chou, Siddharth Joshi, Yu Emma Wang

In this paper, we propose a method to efficiently determine quantization configurations of different tensors in ML models using post-training mixed precision quantization.

Quantization
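One plausible form of sensitivity-guided search is sketched below: quantize each tensor alone to rank sensitivities, then lower precision greedily while an accuracy-drop tolerance still holds. The evaluate() stub, tensor names, and tolerance are placeholders rather than the paper's procedure.

```python
# Hedged sketch, not the paper's algorithm: greedy per-tensor precision search
# ordered by measured sensitivity, constrained by an accuracy-drop tolerance.
def search_config(tensors, evaluate, tolerance=0.01, low_bits=4, high_bits=8):
    baseline = evaluate({t: high_bits for t in tensors})
    # Per-tensor sensitivity: accuracy lost when only that tensor goes to low_bits.
    drop = {t: baseline - evaluate({**{u: high_bits for u in tensors}, t: low_bits})
            for t in tensors}
    config = {t: high_bits for t in tensors}
    for t in sorted(tensors, key=lambda name: drop[name]):   # least sensitive first
        trial = {**config, t: low_bits}
        if baseline - evaluate(trial) <= tolerance:
            config = trial                                    # keep the cheaper setting
    return config

# Toy stand-in for running the quantized model on a validation set.
def evaluate(cfg):
    penalty = {"embed": 0.002, "attn.qkv": 0.015, "ffn.w1": 0.004}
    return 0.90 - sum(penalty[t] for t, b in cfg.items() if b == 4)

print(search_config(["embed", "attn.qkv", "ffn.w1"], evaluate))
```

In this toy run the two low-sensitivity tensors end up at 4 bits and the sensitive attention weights stay at 8 bits, which is the qualitative behavior the search aims for.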

AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models

no code implementations • 21 Jan 2022 • Xiaofan Zhang, Zongwei Zhou, Deming Chen, Yu Emma Wang

By evaluating on SQuAD, a model found by AutoDistill achieves an 88.4% F1 score with 22.8M parameters, which reduces parameters by more than 62% while maintaining higher accuracy than DistilBERT, TinyBERT, and NAS-BERT.

Bayesian Optimization · Knowledge Distillation +2

Exploring the limits of Concurrency in ML Training on Google TPUs

no code implementations • 7 Nov 2020 • Sameer Kumar, James Bradbury, Cliff Young, Yu Emma Wang, Anselm Levskaya, Blake Hechtman, Dehao Chen, HyoukJoong Lee, Mehmet Deveci, Naveen Kumar, Pankaj Kanwar, Shibo Wang, Skye Wanderman-Milne, Steve Lacy, Tao Wang, Tayo Oguntebi, Yazhou Zu, Yuanzhong Xu, Andy Swing

Recent results in language understanding using neural networks have required training hardware of unprecedented scale, with thousands of chips cooperating on a single training run.

Exploiting Parallelism Opportunities with Deep Learning Frameworks

1 code implementation • 13 Aug 2019 • Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David Brooks

State-of-the-art machine learning frameworks support a wide variety of design features to enable a flexible machine learning programming interface and to ease the programmability burden on machine learning developers.

BIG-bench Machine Learning
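For context on the framework-level parallelism this line of work examines, here is a small example of sweeping intra-op and inter-op thread-pool settings, assuming PyTorch; the thread counts and tensor sizes are arbitrary placeholders and the snippet is not taken from the paper's released code.

```python
# Not the paper's code: the kind of framework parallelism knobs such a study sweeps.
import time
import torch

# Thread pools must be configured before any parallel work runs.
torch.set_num_threads(4)          # intra-op parallelism: threads inside one matmul
torch.set_num_interop_threads(2)  # inter-op parallelism: independent ops side by side

x = torch.randn(1024, 1024)
w = torch.randn(1024, 1024)

start = time.perf_counter()
for _ in range(50):
    x @ w                          # runtime dominated by intra-op threading
elapsed = time.perf_counter() - start
print(f"50 matmuls with {torch.get_num_threads()} intra-op threads: {elapsed:.3f}s")
```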

Benchmarking TPU, GPU, and CPU Platforms for Deep Learning

1 code implementation • 24 Jul 2019 • Yu Emma Wang, Gu-Yeon Wei, David Brooks

Training deep learning models is compute-intensive and there is an industry-wide trend towards hardware specialization to improve performance.

Benchmarking
