Search Results for author: Yu Emma Wang

Found 8 papers, 2 papers with code

Hadamard Domain Training with Integers for Class Incremental Quantized Learning

no code implementations • 5 Oct 2023 • Martin Schiemer, Clemens JS Schaefer, Jayden Parker Vap, Mark James Horeni, Yu Emma Wang, Juan Ye, Siddharth Joshi

In this paper, we propose a technique that leverages inexpensive Hadamard transforms to enable low-precision training with only integer matrix multiplications.
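
For intuition, here is a minimal NumPy sketch of the Hadamard-domain idea the abstract describes: rotate both matmul operands with a Hadamard matrix, quantize them, and accumulate entirely in integer arithmetic. The quantization scheme and helper names below are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def hadamard(n):
        # Sylvester construction; n must be a power of two.
        H = np.array([[1]])
        while H.shape[0] < n:
            H = np.block([[H, H], [H, -H]])
        return H

    def quantize_int8(x, scale):
        # Symmetric rounding to the int8 range for a given scale.
        return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

    def hadamard_int_matmul(x, w):
        # Rotate both operands into the Hadamard domain, quantize, and run the
        # matmul in integers; H @ H == n * I, so the rotation cancels at dequant.
        n = x.shape[-1]
        H = hadamard(n)
        xh, wh = x @ H, H @ w
        sx = max(np.abs(xh).max() / 127, 1e-8)
        sw = max(np.abs(wh).max() / 127, 1e-8)
        xq, wq = quantize_int8(xh, sx), quantize_int8(wh, sw)
        acc = xq.astype(np.int32) @ wq.astype(np.int32)   # integer-only matmul
        return acc * (sx * sw / n)                        # fold 1/n into dequantization

    x = np.random.randn(4, 64).astype(np.float32)
    w = np.random.randn(64, 32).astype(np.float32)
    print(np.abs(hadamard_int_matmul(x, w) - x @ w).max())  # small quantization error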

Tasks: Class Incremental Learning +3

Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization

no code implementations • 8 Jun 2023 • Clemens JS Schaefer, Navid Lambert-Shirzad, Xiaofan Zhang, Chiachen Chou, Tom Jablin, Jian Li, Elfie Guo, Caitlin Stanton, Siddharth Joshi, Yu Emma Wang

To address this challenge, we propose a mixed-precision post-training quantization (PTQ) approach that assigns different numerical precisions to tensors in a network based on their individual needs, reducing the memory footprint and improving latency while preserving model accuracy.
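
As an illustration of the kind of assignment such an approach produces (not the paper's actual algorithm), the sketch below spends a bit budget greedily on the most sensitive tensors. The tensor names, sensitivity scores, and budget rule are made up for the example; the paper instead derives sensitivities from Hessians augmented with inter-layer dependencies.

    # Hypothetical per-tensor sensitivity scores (e.g. a Hessian-trace proxy) and sizes.
    tensors = {
        "conv1.weight":  {"sensitivity": 9.1, "elems": 9_408},
        "layer1.weight": {"sensitivity": 2.3, "elems": 147_456},
        "layer4.weight": {"sensitivity": 0.4, "elems": 2_359_296},
        "fc.weight":     {"sensitivity": 5.7, "elems": 2_048_000},
    }

    def assign_bitwidths(tensors, budget_bits, choices=(4, 8, 16)):
        # Start every tensor at the lowest precision, then spend the remaining
        # bit budget on the most sensitive tensors first.
        config = {name: min(choices) for name in tensors}
        spent = sum(tensors[n]["elems"] * config[n] for n in tensors)
        for name in sorted(tensors, key=lambda n: -tensors[n]["sensitivity"]):
            for bits in sorted(choices):
                if bits <= config[name]:
                    continue
                extra = tensors[name]["elems"] * (bits - config[name])
                if spent + extra <= budget_bits:
                    spent += extra
                    config[name] = bits
        return config

    total_fp32 = sum(t["elems"] for t in tensors.values()) * 32
    print(assign_bitwidths(tensors, budget_bits=total_fp32 // 4))  # ~8-bit average budget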

Tasks: Quantization

Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search

no code implementations • 2 Feb 2023 • Clemens JS Schaefer, Elfie Guo, Caitlin Stanton, Xiaofan Zhang, Tom Jablin, Navid Lambert-Shirzad, Jian Li, Chiachen Chou, Siddharth Joshi, Yu Emma Wang

In this paper, we propose a method to efficiently determine quantization configurations for the different tensors in ML models using post-training mixed-precision quantization.
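
A toy sketch of what a sensitivity-guided search loop could look like is shown below; the error model, tensor names, and stopping threshold are assumptions for illustration and do not reproduce the paper's procedure.

    # Raise the precision of the tensor whose estimated quantization error is
    # largest until the total estimated error falls below a target.
    def guided_search(sensitivity, bit_options=(4, 8, 16), max_total_error=1.0):
        config = {name: bit_options[0] for name in sensitivity}

        def tensor_error(name, bits):
            # Toy proxy: error shrinks exponentially as bits are added.
            return sensitivity[name] * 2.0 ** -(bits - 4)

        while sum(tensor_error(n, b) for n, b in config.items()) > max_total_error:
            # Pick the currently worst offender that can still be promoted.
            candidates = [n for n in config if config[n] < bit_options[-1]]
            if not candidates:
                break
            worst = max(candidates, key=lambda n: tensor_error(n, config[n]))
            config[worst] = min(b for b in bit_options if b > config[worst])
        return config

    print(guided_search({"attn.q": 3.0, "attn.k": 2.5, "mlp.up": 0.3, "mlp.down": 0.1}))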

Tasks: Quantization

AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models

no code implementations • 21 Jan 2022 • Xiaofan Zhang, Zongwei Zhou, Deming Chen, Yu Emma Wang

When evaluated on SQuAD, a model found by AutoDistill achieves an 88.4% F1 score with 22.8M parameters, reducing the parameter count by more than 62% while achieving higher accuracy than DistilBERT, TinyBERT, and NAS-BERT.

Tasks: Bayesian Optimization, Knowledge Distillation +2

Exploring the limits of Concurrency in ML Training on Google TPUs

no code implementations • 7 Nov 2020 • Sameer Kumar, James Bradbury, Cliff Young, Yu Emma Wang, Anselm Levskaya, Blake Hechtman, Dehao Chen, HyoukJoong Lee, Mehmet Deveci, Naveen Kumar, Pankaj Kanwar, Shibo Wang, Skye Wanderman-Milne, Steve Lacy, Tao Wang, Tayo Oguntebi, Yazhou Zu, Yuanzhong Xu, Andy Swing

Recent results in language understanding using neural networks have required training hardware of unprecedented scale, with thousands of chips cooperating on a single training run.

Exploiting Parallelism Opportunities with Deep Learning Frameworks

1 code implementation • 13 Aug 2019 • Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David Brooks

State-of-the-art machine learning frameworks support a wide variety of design features that provide a flexible programming interface and ease the programmability burden on machine learning developers.
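
Among those features are the intra-operator and inter-operator parallelism settings that frameworks expose. As one concrete example of these knobs (PyTorch shown here; this is not code from the paper), the thread pools can be configured like this:

    import torch

    # Intra-op threads parallelize a single operator; inter-op threads run
    # independent operators concurrently. Set both before any parallel work starts.
    torch.set_num_threads(8)            # intra-op thread pool size
    torch.set_num_interop_threads(2)    # inter-op thread pool size

    x = torch.randn(2048, 2048)
    y = x @ x                           # this matmul may use up to 8 intra-op threads
    print(torch.get_num_threads(), torch.get_num_interop_threads())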

Tasks: BIG-bench Machine Learning

Benchmarking TPU, GPU, and CPU Platforms for Deep Learning

1 code implementation • 24 Jul 2019 • Yu Emma Wang, Gu-Yeon Wei, David Brooks

Training deep learning models is compute-intensive, and there is an industry-wide trend towards hardware specialization to improve performance.

Tasks: Benchmarking
