Search Results for author: Maxwell Horton

Found 16 papers, 9 papers with code

QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache

no code implementations · 5 Feb 2025 · Rishabh Tiwari, Haocheng Xi, Aditya Tomar, Coleman Hooper, Sehoon Kim, Maxwell Horton, Mahyar Najibi, Michael W. Mahoney, Kurt Keutzer, Amir Gholami

Large Language Models (LLMs) are increasingly being deployed on edge devices for long-context settings, creating a growing need for fast and efficient long-context inference.

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

no code implementations · 14 Oct 2024 · Rasoul Shafipour, David Harrison, Maxwell Horton, Jeffrey Marker, Houman Bedayat, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi, Saman Naderiparizi

Large Language Models (LLMs) have transformed natural language processing, but face significant challenges in widespread deployment due to their high runtime cost.
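The idea named in the title, compressing weight blocks into PRNG seeds plus a few coefficients, can be illustrated with a minimal Python sketch. The function name and the linear-combination decoder here are assumptions for illustration, not the paper's actual algorithm:

```python
import random

def reconstruct_block(seed, coeffs, block_size):
    """Hypothetical sketch of the SeedLM idea: instead of storing a weight
    block directly, store a PRNG seed plus a few linear-combination
    coefficients; the decoder regenerates pseudo-random basis vectors from
    the seed and mixes them."""
    rng = random.Random(seed)
    # One pseudo-random basis vector per stored coefficient.
    basis = [[rng.gauss(0.0, 1.0) for _ in range(block_size)]
             for _ in coeffs]
    return [sum(c * b[i] for c, b in zip(coeffs, basis))
            for i in range(block_size)]

# The same (seed, coeffs) pair always decodes to the same block, so only
# the seed and coefficients need to be stored.
block = reconstruct_block(seed=42, coeffs=[0.5, -1.25, 2.0], block_size=8)
```

Storage drops from `block_size` floats to one integer seed plus a handful of coefficients, trading decode-time computation for memory.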

KV Prediction for Improved Time to First Token

1 code implementation · 10 Oct 2024 · Maxwell Horton, Qingqing Cao, Chenfan Sun, Yanzi Jin, Sachin Mehta, Mohammad Rastegari, Moin Nabi

In our method, a small auxiliary model is used to process the prompt and produce an approximation of the KV cache used by a base model.

Code Completion · HumanEval · +2
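The snippet above describes a small auxiliary model producing an approximate KV cache for a base model. A toy Python sketch of that setup, in which the hypothetical `auxiliary_kv` and `predict_base_cache` functions stand in for the real models and the learned projection:

```python
def auxiliary_kv(prompt_tokens, dim=4):
    """Cheap stand-in for the small auxiliary model: one pass over the
    prompt producing a per-token (key, value) pair."""
    cache = []
    for t in prompt_tokens:
        key = [(t * (i + 1)) % 7 / 7.0 for i in range(dim)]
        value = [(t + i) % 5 / 5.0 for i in range(dim)]
        cache.append((key, value))
    return cache

def predict_base_cache(aux_cache, scale=1.1):
    """Hypothetical learned mapping from the auxiliary cache into the base
    model's KV space (reduced to a single scaling here, for illustration)."""
    return [([scale * x for x in k], [scale * x for x in v])
            for k, v in aux_cache]

# The base model starts decoding from the predicted cache instead of
# running its own (more expensive) prompt pass, cutting time to first token.
predicted = predict_base_cache(auxiliary_kv([3, 1, 4]))
```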

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

1 code implementation · 24 Apr 2024 · Sachin Mehta, Maxwell Horton, Fartash Faghri, Mohammad Hossein Sekhavat, Mahyar Najibi, Mehrdad Farajtabar, Oncel Tuzel, Mohammad Rastegari

Contrastive learning has emerged as a transformative method for learning effective visual representations through the alignment of image and text embeddings.

Contrastive Learning

Diffusion Models as Masked Audio-Video Learners

no code implementations · 5 Oct 2023 · Elvis Nunez, Yanzi Jin, Mohammad Rastegari, Sachin Mehta, Maxwell Horton

Over the past several years, the synchronization between audio and visual signals has been leveraged to learn richer audio-visual representations.

Audio Classification · Contrastive Learning

Bytes Are All You Need: Transformers Operating Directly On File Bytes

2 code implementations · 31 May 2023 · Maxwell Horton, Sachin Mehta, Ali Farhadi, Mohammad Rastegari

Compared to Perceiver IO, our model requires absolutely no modality-specific processing at inference time, and uses an order of magnitude fewer parameters at equivalent accuracy on ImageNet.

Audio Classification · +3

RangeAugment: Efficient Online Augmentation with Range Learning

1 code implementation · 20 Dec 2022 · Sachin Mehta, Saeid Naderiparizi, Fartash Faghri, Maxwell Horton, Lailin Chen, Ali Farhadi, Oncel Tuzel, Mohammad Rastegari

To answer the open question of how important magnitude ranges are for each augmentation operation, we introduce RangeAugment, which efficiently learns the range of magnitudes for individual as well as composite augmentation operations.

Knowledge Distillation · Object Detection · +3
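The RangeAugment snippet above centers on learning a magnitude range per augmentation operation. A minimal Python sketch of that idea, where the `LearnableRange` class is hypothetical and a simple feedback nudge stands in for the paper's gradient-based range update through an image-similarity loss:

```python
import random

class LearnableRange:
    """Sketch of the core RangeAugment idea: each augmentation op keeps a
    learnable magnitude range [lo, hi] and samples a magnitude from it per
    batch. The update rule here is a toy stand-in, not the paper's."""
    def __init__(self, lo=0.0, hi=1.0, lr=0.1):
        self.lo, self.hi, self.lr = lo, hi, lr

    def sample(self, rng):
        # Magnitude for the current batch, drawn from the learned range.
        return rng.uniform(self.lo, self.hi)

    def update(self, too_strong):
        # Shrink the upper bound when the augmentation was too aggressive,
        # expand it otherwise (stand-in for the gradient step).
        self.hi = max(self.lo, self.hi + (-self.lr if too_strong else self.lr))

rng = random.Random(0)
r = LearnableRange()
m = r.sample(rng)          # magnitude used for this batch
r.update(too_strong=True)  # feedback shrinks the range
```

The point of learning the range rather than a fixed magnitude is that each operation, and each dataset, can settle on its own useful interval.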

LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time

1 code implementation · 8 Oct 2021 · Elvis Nunez, Maxwell Horton, Anish Prabhu, Anurag Ranjan, Ali Farhadi, Mohammad Rastegari

Our models require no retraining, thus our subspace of models can be deployed entirely on-device to allow adaptive network compression at inference time.

Quantization
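The LCS snippet above describes a subspace of models that allows adaptive compression at inference time without retraining. A toy Python sketch of one plausible reading, where the function names and the one-dimensional line between an accurate and an efficient endpoint are assumptions for illustration:

```python
def interpolate_weights(w_accurate, w_efficient, alpha):
    """Assumed sketch of a compressible subspace: a line between two sets
    of trained weights, where a point alpha in [0, 1] trades accuracy for
    efficiency at inference time with no retraining."""
    return [(1 - alpha) * a + alpha * b
            for a, b in zip(w_accurate, w_efficient)]

def sparsify(weights, keep_ratio):
    """Toy compression step: keep the largest-magnitude fraction of
    weights and zero out the rest."""
    k = max(1, int(len(weights) * keep_ratio))
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]

# Pick a point midway along the subspace, then compress it on-device.
w = interpolate_weights([1.0, -2.0, 0.5], [0.2, -0.4, 0.1], alpha=0.5)
compressed = sparsify(w, keep_ratio=0.5)
```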

Learning Neural Network Subspaces

1 code implementation · 20 Feb 2021 · Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari

Recent observations have advanced our understanding of the neural network optimization landscape, revealing the existence of (1) paths of high accuracy containing diverse solutions and (2) wider minima offering improved performance.
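The snippet above alludes to regions of weight space containing diverse high-accuracy solutions. A minimal Python sketch of sampling from such a subspace, where the helper name and the simplex-of-endpoints parameterization are illustrative assumptions:

```python
import random

def sample_subspace_point(endpoints, rng):
    """Sketch of the neural network subspace idea: the learned object is a
    simplex of weight vectors, and training samples random convex
    combinations so that a whole region of weight space becomes accurate."""
    # Random convex weights (equivalent to a uniform Dirichlet draw).
    raw = [rng.expovariate(1.0) for _ in endpoints]
    total = sum(raw)
    coeffs = [r / total for r in raw]
    dim = len(endpoints[0])
    return [sum(c * e[i] for c, e in zip(coeffs, endpoints))
            for i in range(dim)]

rng = random.Random(0)
endpoints = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # simplex corners
point = sample_subspace_point(endpoints, rng)     # one network in the region
```

At each training step a fresh point would be sampled and its loss backpropagated to all endpoints, so the entire simplex, not a single weight vector, is optimized.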

Layer-Wise Data-Free CNN Compression

no code implementations · 18 Nov 2020 · Maxwell Horton, Yanzi Jin, Ali Farhadi, Mohammad Rastegari

We also show how to precondition the network to improve the accuracy of our layer-wise compression method.

Quantization
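Layer-wise data-free compression, as described above, treats each layer independently and needs no training data. A toy Python sketch of the simplest variant, per-layer uniform quantization; the function name and grid scheme are assumptions, and the paper's preconditioning step is not modeled:

```python
def quantize_layer(weights, num_levels=16):
    """Sketch of layer-wise data-free quantization: each layer is
    compressed independently by snapping its weights to a small uniform
    grid derived only from the layer's own min/max, so no data is needed."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)
    step = (hi - lo) / (num_levels - 1)
    # Snap every weight to the nearest grid point.
    return [lo + round((w - lo) / step) * step for w in weights]

q = quantize_layer([0.1, -0.7, 0.3, 0.9], num_levels=4)
```

Because each layer is handled in isolation, errors can compound across the network; preconditioning the network before compression (as the snippet mentions) is one way to mitigate that.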

Label Refinery: Improving ImageNet Classification through Label Progression

4 code implementations · 7 May 2018 · Hessam Bagherinezhad, Maxwell Horton, Mohammad Rastegari, Ali Farhadi

Among the three main components (data, labels, and models) of any supervised learning system, data and models have been the main subjects of active research.

Classification · General Classification
