Search Results for author: Michael Carbin

Found 27 papers, 12 papers with code

The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning

no code implementations7 Oct 2023 Tian Jin, Nolan Clement, Xin Dong, Vaishnavh Nagarajan, Michael Carbin, Jonathan Ragan-Kelley, Gintare Karolina Dziugaite

We study two natural scaling techniques -- weight pruning and simply training a smaller or larger model, which we refer to as dense scaling -- and their effects on two core capabilities of LLMs: (a) recalling facts presented during pre-training and (b) processing information presented in-context during inference.

In-Context Learning

Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs

no code implementations21 Sep 2023 Alex Renda, Yi Ding, Michael Carbin

We first characterize the proportion of data to sample from each region of a program's input space (corresponding to different execution paths of the program) based on the complexity of learning a surrogate of the corresponding execution path.
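The allocation idea can be illustrated with a sketch: split a fixed sampling budget across execution paths in proportion to a per-path complexity score. The proportional rule and the scores below are illustrative assumptions, not Turaco's actual complexity-derived formula.

```python
def allocate_samples(complexities, budget):
    """Split a sampling budget across program paths by a complexity score.

    Illustrative only: Turaco derives per-region proportions from a
    learning-complexity analysis; here we simply allocate proportionally
    to assumed complexity scores to show the shape of the idea.
    """
    total = sum(complexities.values())
    return {path: round(budget * c / total) for path, c in complexities.items()}

# A path assumed three times harder to learn gets three times the samples:
allocate_samples({"fast_path": 1.0, "slow_path": 3.0}, 100)
```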

Computably Continuous Reinforcement-Learning Objectives are PAC-learnable

no code implementations9 Mar 2023 Cambridge Yang, Michael Littman, Michael Carbin

In particular, for the analysis that considers only sample complexity, we prove that if an objective given as an oracle is uniformly continuous, then it is PAC-learnable.

General Reinforcement Learning reinforcement-learning +1

Acela: Predictable Datacenter-level Maintenance Job Scheduling

no code implementations10 Dec 2022 Yi Ding, Aijia Gao, Thibaud Ryden, Kaushik Mitra, Sukumar Kalmanje, Yanai Golany, Michael Carbin, Henry Hoffmann

While it is tempting to use prior machine learning techniques for predicting job duration, we find that the structure of the maintenance job scheduling problem creates a unique challenge.

Scheduling

Pruning's Effect on Generalization Through the Lens of Training and Regularization

no code implementations25 Oct 2022 Tian Jin, Michael Carbin, Daniel M. Roy, Jonathan Frankle, Gintare Karolina Dziugaite

Pruning models in this over-parameterized regime leads to a contradiction -- while theory predicts that reducing model size harms generalization, pruning to a range of sparsities nonetheless improves it.

SCOPE: Safe Exploration for Dynamic Computer Systems Optimization

no code implementations22 Apr 2022 Hyunji Kim, Ahsan Pervaiz, Henry Hoffmann, Michael Carbin, Yi Ding

Such solutions monitor past system executions to learn the system's behavior under different hardware resource allocations before dynamically tuning resources to optimize the application execution.

Safe Exploration

Cello: Efficient Computer Systems Optimization with Predictive Early Termination and Censored Regression

no code implementations11 Apr 2022 Yi Ding, Alex Renda, Ahsan Pervaiz, Michael Carbin, Henry Hoffmann

Our evaluation shows that compared to the state-of-the-art SEML approach in computer systems optimization, Cello improves latency by 1.19X for minimizing latency under a power constraint, and improves energy by 1.18X for minimizing energy under a latency constraint.

regression

Programming with Neural Surrogates of Programs

1 code implementation12 Dec 2021 Alex Renda, Yi Ding, Michael Carbin

With surrogate adaptation, programmers develop a surrogate of a program, then retrain that surrogate on a different task.

On the (In)Tractability of Reinforcement Learning for LTL Objectives

no code implementations24 Nov 2021 Cambridge Yang, Michael Littman, Michael Carbin

In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.

reinforcement-learning Reinforcement Learning (RL)

Reinforcement Learning with General LTL Objectives is Intractable

no code implementations AAAI Workshop CLeaR 2022 Cambridge Yang, Michael Littman, Michael Carbin

In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.

reinforcement-learning Reinforcement Learning (RL)

Studying the Consistency and Composability of Lottery Ticket Pruning Masks

no code implementations30 Apr 2021 Rajiv Movva, Jonathan Frankle, Michael Carbin

Magnitude pruning is a common, effective technique to identify sparse subnetworks at little cost to accuracy.
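The selection rule behind magnitude pruning can be sketched in a few lines: remove the weights with the smallest absolute values. This is a minimal one-shot, global version; the paper's experiments apply the rule iteratively to real networks.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude.

    A minimal sketch of global magnitude pruning: rank weights by |w| and
    remove the smallest. Ties at the threshold are also removed.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)                 # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

w = np.array([[0.5, -0.01], [0.02, -1.2]])
pruned = magnitude_prune(w, 0.5)  # removes the two smallest-magnitude weights
```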

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models

1 code implementation CVPR 2021 Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

We extend the scope of LTH and question whether matching subnetworks still exist in pre-trained computer vision models that enjoy the same downstream transfer performance.

DiffTune: Optimizing CPU Simulator Parameters with Learned Differentiable Surrogates

2 code implementations8 Oct 2020 Alex Renda, Yishen Chen, Charith Mendis, Michael Carbin

In this paper we present DiffTune, a system for learning the parameters of x86 basic block CPU simulators from coarse-grained end-to-end measurements.

Scheduling
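The core trick can be sketched abstractly: the real simulator is not differentiable in its parameters, so DiffTune optimizes them against a differentiable surrogate by gradient descent on the gap to end-to-end measurements. The linear surrogate in the usage example below is an assumption for illustration, not the paper's learned model.

```python
import numpy as np

def fit_params_via_surrogate(surrogate, surrogate_grad, measured, params,
                             lr=0.01, steps=200):
    """Gradient-descend simulator parameters through a differentiable surrogate.

    Sketch of the DiffTune idea: minimize 0.5 * (surrogate(params) - measured)^2
    with plain gradient descent. Both `surrogate` and `surrogate_grad` are
    stand-ins supplied by the caller.
    """
    for _ in range(steps):
        residual = surrogate(params) - measured
        params = params - lr * surrogate_grad(params) * residual
    return params

# Illustrative linear surrogate: predicted throughput = params . features
x = np.array([1.0, 2.0])
fitted = fit_params_via_surrogate(lambda p: float(p.dot(x)),
                                  lambda p: x,
                                  measured=5.0,
                                  params=np.zeros(2))
```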

λ_S: Computable Semantics for Differentiable Programming with Higher-Order Functions and Datatypes

1 code implementation15 Jul 2020 Benjamin Sherman, Jesse Michel, Michael Carbin

Deep learning is moving towards increasingly sophisticated optimization objectives that employ higher-order functions, such as integration, continuous optimization, and root-finding.

Programming Languages Logic in Computer Science D.3.1; F.3.2

On the Predictability of Pruning Across Scales

no code implementations18 Jun 2020 Jonathan S. Rosenfeld, Jonathan Frankle, Michael Carbin, Nir Shavit

We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task.

TIRAMISU: A Polyhedral Compiler for Dense and Sparse Deep Learning

no code implementations7 May 2020 Riyadh Baghdadi, Abdelkader Nadir Debbagh, Kamel Abdous, Fatima Zohra Benhamida, Alex Renda, Jonathan Elliott Frankle, Michael Carbin, Saman Amarasinghe

In this paper, we demonstrate a compiler that can optimize sparse and recurrent neural networks, both of which are currently outside of the scope of existing neural network compilers (sparse neural networks here stand for networks that can be accelerated with sparse tensor algebra techniques).

Comparing Rewinding and Fine-tuning in Neural Network Pruning

2 code implementations ICLR 2020 Alex Renda, Jonathan Frankle, Michael Carbin

Learning rate rewinding (which we propose) trains the unpruned weights from their final values using the same learning rate schedule as weight rewinding.

Network Pruning
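The two retraining strategies differ only in which weights retraining starts from and which part of the learning-rate schedule is replayed. A schematic, using a stand-in linear-decay schedule rather than the papers' real training schedules:

```python
def lr_schedule(step, total_steps, lr0=0.1):
    # Stand-in schedule: linear decay from lr0 to 0 over training.
    return lr0 * (1 - step / total_steps)

def retrain_config(method, final_w, rewound_w, total_steps, rewind_step):
    """Return (initial weights, learning-rate sequence) for retraining.

    Weight rewinding: restart from earlier weights AND replay only the
    matching tail of the schedule. Learning rate rewinding: keep the
    final trained weights but replay the entire schedule.
    """
    if method == "weight_rewinding":
        lrs = [lr_schedule(t, total_steps) for t in range(rewind_step, total_steps)]
        return rewound_w, lrs
    if method == "lr_rewinding":
        lrs = [lr_schedule(t, total_steps) for t in range(total_steps)]
        return final_w, lrs
    raise ValueError(method)
```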

Linear Mode Connectivity and the Lottery Ticket Hypothesis

2 code implementations ICML 2020 Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random data order and augmentation).

Linear Mode Connectivity
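The linear mode connectivity test itself is simple to sketch: evaluate the loss along the straight line between two solutions and measure the worst increase over the endpoints. The toy 1D loss in the usage example is an assumption for illustration, not a neural network.

```python
import numpy as np

def barrier(loss_fn, w_a, w_b, n_points=11):
    """Max loss increase along the line segment between two solutions.

    Sketch of the linear-connectivity check: evaluate loss_fn at the
    interpolated weights alpha*w_a + (1-alpha)*w_b. A barrier near zero
    means the two solutions lie in a linearly connected minimum.
    """
    alphas = np.linspace(0.0, 1.0, n_points)
    losses = [loss_fn(a * w_a + (1 - a) * w_b) for a in alphas]
    endpoint_worst = max(loss_fn(w_a), loss_fn(w_b))
    return max(losses) - endpoint_worst

# Two minima of f(w) = (w(w-2))^2 at w=0 and w=2 are NOT linearly connected:
loss = lambda w: float(np.sum((w * (w - 2.0)) ** 2))
barrier(loss, np.array([0.0]), np.array([2.0]))  # hits the bump at w=1
```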

Compiler Auto-Vectorization with Imitation Learning

1 code implementation NeurIPS 2019 Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe, Michael Carbin

We show that the learnt policy produces a vectorization scheme which is better than industry standard compiler heuristics both in terms of static measures and runtime performance.

Imitation Learning

Mode Connectivity and Sparse Neural Networks

no code implementations25 Sep 2019 Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

We observe that these subnetworks match the accuracy of the full network only when two SGD runs for the same subnetwork are connected by linear paths with no change in test error.

Stabilizing the Lottery Ticket Hypothesis

3 code implementations5 Mar 2019 Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

With this change, it finds small subnetworks of deeper networks (e.g., 80% sparsity on ResNet-50) that can complete the training process to match the accuracy of the original network on more challenging tasks (e.g., ImageNet).

Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks

3 code implementations21 Aug 2018 Charith Mendis, Alex Renda, Saman Amarasinghe, Michael Carbin

Predicting the number of clock cycles a processor takes to execute a block of assembly instructions in steady state (the throughput) is important for both compiler designers and performance engineers.

The Three Pillars of Machine Programming

no code implementations20 Mar 2018 Justin Gottschlich, Armando Solar-Lezama, Nesime Tatbul, Michael Carbin, Martin Rinard, Regina Barzilay, Saman Amarasinghe, Joshua B. Tenenbaum, Tim Mattson

In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research.

BIG-bench Machine Learning Position

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

24 code implementations ICLR 2019 Jonathan Frankle, Michael Carbin

Based on these results, we articulate the "lottery ticket hypothesis:" dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations.

Network Pruning
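The iterative procedure the hypothesis is based on can be sketched compactly: train, prune the smallest-magnitude surviving weights, reset survivors to their original initialization, and repeat. `train_fn` below is a stand-in for full SGD training (it maps weights to trained weights); the identity function is used only to exercise the loop.

```python
import numpy as np

def find_winning_ticket(init_w, train_fn, prune_fraction, rounds):
    """Iterative magnitude pruning with rewinding to initialization.

    Sketch of the lottery-ticket procedure: each round trains the current
    subnetwork, removes the prune_fraction smallest-magnitude survivors,
    and resets the remaining weights to their ORIGINAL initial values.
    """
    mask = np.ones_like(init_w)
    for _ in range(rounds):
        trained = train_fn(init_w * mask)
        survivors = np.abs(trained[mask == 1])
        k = int(prune_fraction * survivors.size)
        if k == 0:
            break
        threshold = np.sort(survivors)[k - 1]
        mask = mask * (np.abs(trained) > threshold)
    return init_w * mask, mask
```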
