Search Results for author: Felix Chern

Found 4 papers, 1 paper with code

The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers

no code implementations • 12 Oct 2022 • Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix Yu, Ruiqi Guo, Sanjiv Kumar

This paper studies the curious phenomenon for machine learning models with Transformer architectures that their activation maps are sparse.

TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s

no code implementations • 28 Jun 2022 • Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar

This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with a similar level of recall.

New Loss Functions for Fast Maximum Inner Product Search

no code implementations • ICLR 2020 • Ruiqi Guo, Quan Geng, David Simcha, Felix Chern, Phil Sun, Sanjiv Kumar

In this work, we focus directly on minimizing error in inner product approximation and derive a new class of quantization loss functions.

Benchmarking, Quantization

Accelerating Large-Scale Inference with Anisotropic Vector Quantization

3 code implementations • ICML 2020 • Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar

Based on the observation that for a given query, the database points that have the largest inner products are more relevant, we develop a family of anisotropic quantization loss functions.

Benchmarking, Quantization
