Search Results for author: Michael Carbin

Found 27 papers, 12 papers with code

The Cost of Down-Scaling Language Models: Fact Recall Deteriorates before In-Context Learning

no code implementations7 Oct 2023 Tian Jin, Nolan Clement, Xin Dong, Vaishnavh Nagarajan, Michael Carbin, Jonathan Ragan-Kelley, Gintare Karolina Dziugaite

We study two natural scaling techniques -- weight pruning and simply training a smaller or larger model, which we refer to as dense scaling -- and their effects on two core capabilities of LLMs: (a) recalling facts presented during pre-training and (b) processing information presented in-context during inference.

In-Context Learning

Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs

no code implementations21 Sep 2023 Alex Renda, Yi Ding, Michael Carbin

We first characterize the proportion of data to sample from each region of a program's input space (corresponding to different execution paths of the program) based on the complexity of learning a surrogate of the corresponding execution path.
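The allocation idea can be illustrated with a sketch: split a fixed sampling budget across execution paths in proportion to a per-path complexity score. The proportional rule and the scores below are illustrative assumptions, not Turaco's actual complexity-derived formula.

```python
def allocate_samples(complexities, budget):
    """Split a sampling budget across program paths by a complexity score.

    Illustrative only: Turaco derives per-region proportions from a
    learning-complexity analysis; here we simply allocate proportionally
    to assumed complexity scores to show the shape of the idea.
    """
    total = sum(complexities.values())
    return {path: round(budget * c / total) for path, c in complexities.items()}

# A path assumed three times harder to learn gets three times the samples:
allocate_samples({"fast_path": 1.0, "slow_path": 3.0}, 100)
```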

Computably Continuous Reinforcement-Learning Objectives are PAC-learnable

no code implementations9 Mar 2023 Cambridge Yang, Michael Littman, Michael Carbin

In particular, for the analysis that considers only sample complexity, we prove that if an objective given as an oracle is uniformly continuous, then it is PAC-learnable.

General Reinforcement Learning reinforcement-learning +1

Acela: Predictable Datacenter-level Maintenance Job Scheduling

no code implementations10 Dec 2022 Yi Ding, Aijia Gao, Thibaud Ryden, Kaushik Mitra, Sukumar Kalmanje, Yanai Golany, Michael Carbin, Henry Hoffmann

While it is tempting to use prior machine learning techniques for predicting job duration, we find that the structure of the maintenance job scheduling problem creates a unique challenge.

Scheduling

Pruning's Effect on Generalization Through the Lens of Training and Regularization

no code implementations25 Oct 2022 Tian Jin, Michael Carbin, Daniel M. Roy, Jonathan Frankle, Gintare Karolina Dziugaite

Pruning models in this over-parameterized regime leads to a contradiction -- while theory predicts that reducing model size harms generalization, pruning to a range of sparsities nonetheless improves it.

SCOPE: Safe Exploration for Dynamic Computer Systems Optimization

no code implementations22 Apr 2022 Hyunji Kim, Ahsan Pervaiz, Henry Hoffmann, Michael Carbin, Yi Ding

Such solutions monitor past system executions to learn the system's behavior under different hardware resource allocations before dynamically tuning resources to optimize the application execution.

Safe Exploration

Cello: Efficient Computer Systems Optimization with Predictive Early Termination and Censored Regression

no code implementations11 Apr 2022 Yi Ding, Alex Renda, Ahsan Pervaiz, Michael Carbin, Henry Hoffmann

Our evaluation shows that compared to the state-of-the-art SEML approach in computer systems optimization, Cello improves latency by 1.19X for minimizing latency under a power constraint, and improves energy by 1.18X for minimizing energy under a latency constraint.

regression

Programming with Neural Surrogates of Programs

1 code implementation12 Dec 2021 Alex Renda, Yi Ding, Michael Carbin

With surrogate adaptation, programmers develop a surrogate of a program, then retrain that surrogate on a different task.

On the (In)Tractability of Reinforcement Learning for LTL Objectives

no code implementations24 Nov 2021 Cambridge Yang, Michael Littman, Michael Carbin

In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.

reinforcement-learning Reinforcement Learning (RL)

Reinforcement Learning with General LTL Objectives is Intractable

no code implementations AAAI Workshop CLeaR 2022 Cambridge Yang, Michael Littman, Michael Carbin

In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.

reinforcement-learning Reinforcement Learning (RL)

Studying the Consistency and Composability of Lottery Ticket Pruning Masks

no code implementations30 Apr 2021 Rajiv Movva, Jonathan Frankle, Michael Carbin

Magnitude pruning is a common, effective technique to identify sparse subnetworks at little cost to accuracy.
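The selection rule behind magnitude pruning can be sketched in a few lines: remove the weights with the smallest absolute values. This is a minimal one-shot, global version; the paper's experiments apply the rule iteratively to real networks.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude.

    A minimal sketch of global magnitude pruning: rank weights by |w| and
    remove the smallest. Ties at the threshold are also removed.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)                 # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

w = np.array([[0.5, -0.01], [0.02, -1.2]])
pruned = magnitude_prune(w, 0.5)  # removes the two smallest-magnitude weights
```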

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models

1 code implementation CVPR 2021 Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

We extend the scope of LTH and question whether matching subnetworks still exist in pre-trained computer vision models that enjoy the same downstream transfer performance.

DiffTune: Optimizing CPU Simulator Parameters with Learned Differentiable Surrogates

2 code implementations8 Oct 2020 Alex Renda, Yishen Chen, Charith Mendis, Michael Carbin

In this paper we present DiffTune, a system for learning the parameters of x86 basic block CPU simulators from coarse-grained end-to-end measurements.

Scheduling
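The core trick can be sketched abstractly: the real simulator is not differentiable in its parameters, so DiffTune optimizes them against a differentiable surrogate by gradient descent on the gap to end-to-end measurements. The linear surrogate in the usage example below is an assumption for illustration, not the paper's learned model.

```python
import numpy as np

def fit_params_via_surrogate(surrogate, surrogate_grad, measured, params,
                             lr=0.01, steps=200):
    """Gradient-descend simulator parameters through a differentiable surrogate.

    Sketch of the DiffTune idea: minimize 0.5 * (surrogate(params) - measured)^2
    with plain gradient descent. Both `surrogate` and `surrogate_grad` are
    stand-ins supplied by the caller.
    """
    for _ in range(steps):
        residual = surrogate(params) - measured
        params = params - lr * surrogate_grad(params) * residual
    return params

# Illustrative linear surrogate: predicted throughput = params . features
x = np.array([1.0, 2.0])
fitted = fit_params_via_surrogate(lambda p: float(p.dot(x)),
                                  lambda p: x,
                                  measured=5.0,
                                  params=np.zeros(2))
```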

λ_S: Computable Semantics for Differentiable Programming with Higher-Order Functions and Datatypes

1 code implementation15 Jul 2020 Benjamin Sherman, Jesse Michel, Michael Carbin

Deep learning is moving towards increasingly sophisticated optimization objectives that employ higher-order functions, such as integration, continuous optimization, and root-finding.

Programming Languages Logic in Computer Science D.3.1; F.3.2

On the Predictability of Pruning Across Scales

no code implementations18 Jun 2020 Jonathan S. Rosenfeld, Jonathan Frankle, Michael Carbin, Nir Shavit

We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task.

TIRAMISU: A Polyhedral Compiler for Dense and Sparse Deep Learning

no code implementations7 May 2020 Riyadh Baghdadi, Abdelkader Nadir Debbagh, Kamel Abdous, Fatima Zohra Benhamida, Alex Renda, Jonathan Elliott Frankle, Michael Carbin, Saman Amarasinghe

In this paper, we demonstrate a compiler that can optimize sparse and recurrent neural networks, both of which are currently outside of the scope of existing neural network compilers (sparse neural networks here stand for networks that can be accelerated with sparse tensor algebra techniques).

Comparing Rewinding and Fine-tuning in Neural Network Pruning

2 code implementations ICLR 2020 Alex Renda, Jonathan Frankle, Michael Carbin

Learning rate rewinding (which we propose) trains the unpruned weights from their final values using the same learning rate schedule as weight rewinding.

Network Pruning
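The two retraining strategies differ only in which weights retraining starts from and which part of the learning-rate schedule is replayed. A schematic, using a stand-in linear-decay schedule rather than the papers' real training schedules:

```python
def lr_schedule(step, total_steps, lr0=0.1):
    # Stand-in schedule: linear decay from lr0 to 0 over training.
    return lr0 * (1 - step / total_steps)

def retrain_config(method, final_w, rewound_w, total_steps, rewind_step):
    """Return (initial weights, learning-rate sequence) for retraining.

    Weight rewinding: restart from earlier weights AND replay only the
    matching tail of the schedule. Learning rate rewinding: keep the
    final trained weights but replay the entire schedule.
    """
    if method == "weight_rewinding":
        lrs = [lr_schedule(t, total_steps) for t in range(rewind_step, total_steps)]
        return rewound_w, lrs
    if method == "lr_rewinding":
        lrs = [lr_schedule(t, total_steps) for t in range(total_steps)]
        return final_w, lrs
    raise ValueError(method)
```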

Linear Mode Connectivity and the Lottery Ticket Hypothesis

2 code implementations ICML 2020 Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random data order and augmentation).

Linear Mode Connectivity
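The linear mode connectivity test itself is simple to sketch: evaluate the loss along the straight line between two solutions and measure the worst increase over the endpoints. The toy 1D loss in the usage example is an assumption for illustration, not a neural network.

```python
import numpy as np

def barrier(loss_fn, w_a, w_b, n_points=11):
    """Max loss increase along the line segment between two solutions.

    Sketch of the linear-connectivity check: evaluate loss_fn at the
    interpolated weights alpha*w_a + (1-alpha)*w_b. A barrier near zero
    means the two solutions lie in a linearly connected minimum.
    """
    alphas = np.linspace(0.0, 1.0, n_points)
    losses = [loss_fn(a * w_a + (1 - a) * w_b) for a in alphas]
    endpoint_worst = max(loss_fn(w_a), loss_fn(w_b))
    return max(losses) - endpoint_worst

# Two minima of f(w) = (w(w-2))^2 at w=0 and w=2 are NOT linearly connected:
loss = lambda w: float(np.sum((w * (w - 2.0)) ** 2))
barrier(loss, np.array([0.0]), np.array([2.0]))  # hits the bump at w=1
```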

Compiler Auto-Vectorization with Imitation Learning

1 code implementation NeurIPS 2019 Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe, Michael Carbin

We show that the learnt policy produces a vectorization scheme which is better than industry standard compiler heuristics both in terms of static measures and runtime performance.

Imitation Learning

Mode Connectivity and Sparse Neural Networks

no code implementations25 Sep 2019 Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

We observe that these subnetworks match the accuracy of the full network only when two SGD runs for the same subnetwork are connected by linear paths with no change in test error.

Stabilizing the Lottery Ticket Hypothesis

3 code implementations5 Mar 2019 Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

With this change, it finds small subnetworks of deeper networks (e.g., 80% sparsity on ResNet-50) that can complete the training process to match the accuracy of the original network on more challenging tasks (e.g., ImageNet).

Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks

3 code implementations21 Aug 2018 Charith Mendis, Alex Renda, Saman Amarasinghe, Michael Carbin

Predicting the number of clock cycles a processor takes to execute a block of assembly instructions in steady state (the throughput) is important for both compiler designers and performance engineers.

The Three Pillars of Machine Programming

no code implementations20 Mar 2018 Justin Gottschlich, Armando Solar-Lezama, Nesime Tatbul, Michael Carbin, Martin Rinard, Regina Barzilay, Saman Amarasinghe, Joshua B. Tenenbaum, Tim Mattson

In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research.

BIG-bench Machine Learning Position

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

24 code implementations ICLR 2019 Jonathan Frankle, Michael Carbin

Based on these results, we articulate the "lottery ticket hypothesis:" dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations.

Network Pruning
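The iterative procedure the hypothesis is based on can be sketched compactly: train, prune the smallest-magnitude surviving weights, reset survivors to their original initialization, and repeat. `train_fn` below is a stand-in for full SGD training (it maps weights to trained weights); the identity function is used only to exercise the loop.

```python
import numpy as np

def find_winning_ticket(init_w, train_fn, prune_fraction, rounds):
    """Iterative magnitude pruning with rewinding to initialization.

    Sketch of the lottery-ticket procedure: each round trains the current
    subnetwork, removes the prune_fraction smallest-magnitude survivors,
    and resets the remaining weights to their ORIGINAL initial values.
    """
    mask = np.ones_like(init_w)
    for _ in range(rounds):
        trained = train_fn(init_w * mask)
        survivors = np.abs(trained[mask == 1])
        k = int(prune_fraction * survivors.size)
        if k == 0:
            break
        threshold = np.sort(survivors)[k - 1]
        mask = mask * (np.abs(trained) > threshold)
    return init_w * mask, mask
```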
