1 code implementation • 6 Feb 2024 • Albert Tseng, Jerry Chee, Qingyao Sun, Volodymyr Kuleshov, Christopher De Sa
QuIP# uses vector quantization techniques to take advantage of the ball-shaped sub-Gaussian distribution that incoherent weights possess: specifically, we introduce a set of hardware-efficient codebooks based on the highly symmetric $E_8$ lattice, which achieves the optimal 8-dimensional unit-ball packing.
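For intuition, nearest-neighbor rounding to the $E_8$ lattice can be sketched using its standard decomposition $E_8 = D_8 \cup (D_8 + \tfrac{1}{2})$, where $D_8$ is the set of integer vectors with even coordinate sum. The helper names below (`nearest_Dn`, `nearest_E8`) are illustrative only; QuIP# builds hardware-efficient codebooks on top of this lattice rather than decoding it this way at runtime.

```python
import numpy as np

def nearest_Dn(x):
    # Nearest point of D_n (integer vectors with even coordinate sum):
    # round each coordinate; if the sum is odd, flip the coordinate
    # with the largest rounding error to its second-nearest integer.
    f = np.round(x)
    if int(f.sum()) % 2 != 0:
        i = np.argmax(np.abs(x - f))
        f[i] += np.sign(x[i] - f[i]) if x[i] != f[i] else 1.0
    return f

def nearest_E8(x):
    # E8 = D8 ∪ (D8 + 1/2): decode in both cosets, keep the closer point.
    a = nearest_Dn(x)
    b = nearest_Dn(x - 0.5) + 0.5
    return a if np.linalg.norm(x - a) <= np.linalg.norm(x - b) else b
```

A useful sanity check: every $E_8$ point has coordinates that are all integers or all half-integers, and its coordinate sum is an even integer.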
1 code implementation • NeurIPS 2023 • Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, Christopher De Sa
This work studies post-training parameter quantization in large language models (LLMs).
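As background, a common baseline for post-training quantization is per-row round-to-nearest onto a symmetric uniform grid. The sketch below shows that generic baseline only; it is not the QuIP procedure itself, which improves on round-to-nearest with incoherence processing and adaptive rounding.

```python
import numpy as np

def quantize_rtn(W, bits=4):
    # Per-row round-to-nearest: scale each row so its max magnitude
    # maps to the largest positive level, then round to integers.
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(W).max(axis=1, keepdims=True) / levels
    scale[scale == 0] = 1.0          # guard all-zero rows
    Q = np.clip(np.round(W / scale), -levels - 1, levels)
    return Q * scale, Q, scale
```

By construction the per-entry reconstruction error is at most half a quantization step (`scale / 2`).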
no code implementations • 8 Oct 2021 • Jerry Chee, Sebastian Braun, Vishak Gopal, Ross Cutler
We study the role of magnitude structured pruning as an architecture search to speed up the inference time of a deep noise suppression (DNS) model.
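A minimal sketch of magnitude structured pruning (a hypothetical helper, not the paper's code): rank the output channels of a convolution weight by their L2 norm and keep only the top fraction, which shrinks the layer rather than just zeroing individual weights.

```python
import numpy as np

def prune_channels(weight, keep_ratio=0.5):
    # weight has shape (out_channels, in_channels, kh, kw).
    # Rank output channels by L2 magnitude; keep the top fraction.
    norms = np.linalg.norm(weight.reshape(weight.shape[0], -1), axis=1)
    k = max(1, int(round(keep_ratio * weight.shape[0])))
    keep = np.sort(np.argsort(-norms)[:k])   # kept indices, in order
    return weight[keep], keep
```

Because whole channels are removed, the pruned weight is a dense, smaller tensor, which is what makes this usable as an architecture search for inference speedups.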
1 code implementation • 30 Jul 2021 • Jerry Chee, Megan Renz, Anil Damle, Christopher De Sa
After training complex deep learning models, a common task is to compress the model to reduce compute and storage demands.
1 code implementation • ICLR 2022 • Chengrun Yang, Ziyang Wu, Jerry Chee, Christopher De Sa, Madeleine Udell
Low-precision arithmetic trains deep learning models using less energy, less memory and less time.
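One common way low-precision training is simulated in software is quantizing tensors to a fixed-point grid with stochastic rounding, which is unbiased in expectation (ignoring clipping). The helper below is an illustrative sketch under that assumption, not the paper's implementation.

```python
import numpy as np

def quantize_fixed_point(x, bits=8, frac=4, rng=None):
    # Simulate fixed-point arithmetic with `bits` total bits and
    # `frac` fractional bits, using stochastic rounding.
    rng = np.random.default_rng(0) if rng is None else rng
    scale = 2.0 ** frac
    lo = -(2 ** (bits - 1)) / scale
    hi = (2 ** (bits - 1) - 1) / scale
    y = x * scale
    floor = np.floor(y)
    y = floor + (rng.random(x.shape) < (y - floor))  # round up w.p. frac part
    return np.clip(y / scale, lo, hi)
```

With `bits=8, frac=4` every output is a multiple of 1/16 in the range [-8, 127/16].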
no code implementations • 27 Aug 2020 • Jerry Chee, Ping Li
We construct a statistical diagnostic test for convergence to the stationary phase using the inner product between successive gradients and demonstrate that the proposed diagnostic works well.
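The diagnostic described above can be sketched as follows (illustrative code, not the authors' implementation): track the running sum of inner products between successive stochastic gradients. During the transient phase successive gradients point in similar directions, so the sum grows; once the iterates oscillate around the optimum the expected inner product turns negative and the sum drifts downward.

```python
import numpy as np

def sgd_with_diagnostic(grad_fn, x0, lr=0.1, steps=200, rng=None):
    # Run SGD while accumulating <g_{t-1}, g_t>; a persistently
    # decreasing running sum signals the stationary phase.
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.asarray(x0, float)
    prev_g, s, history = None, 0.0, []
    for _ in range(steps):
        g = grad_fn(x, rng)
        if prev_g is not None:
            s += float(prev_g @ g)
        history.append(s)
        prev_g = g
        x = x - lr * g
    return x, history
```

On a noisy quadratic (`grad_fn = lambda x, rng: x + rng.normal(size=x.shape)`), the running sum rises sharply while the iterate approaches the optimum and flattens out afterwards.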
no code implementations • 17 Oct 2017 • Jerry Chee, Panos Toulis
During the transient phase the procedure converges towards a region of interest, and during the stationary phase the procedure oscillates in that region, commonly around a single point.
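The two phases are easy to see in a toy simulation (a hypothetical setup, not from the paper): SGD on $f(x) = x^2/2$ with additive gradient noise first marches toward the minimum, then settles into oscillation inside a noise ball around it.

```python
import numpy as np

def run_sgd(x0=10.0, lr=0.1, noise=1.0, steps=300, seed=0):
    # Noisy gradient of f(x) = x^2 / 2 is x + noise * eps.
    # Transient: |x| shrinks geometrically. Stationary: |x| hovers
    # at the noise floor set by lr and noise.
    rng = np.random.default_rng(seed)
    x, traj = x0, []
    for _ in range(steps):
        g = x + noise * rng.normal()
        x -= lr * g
        traj.append(x)
    return np.array(traj)
```

Early iterates are still far from the optimum, while the tail of the trajectory stays within a small neighborhood of it.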