no code implementations • 17 Feb 2025 • Tian Jin, Ellie Y. Cheng, Zack Ankner, Nikunj Saunshi, Blake M. Elias, Amir Yazdanbakhsh, Jonathan Ragan-Kelley, Suvinay Subramanian, Michael Carbin
We present PASTA, a learning-based system that teaches LLMs to identify semantic independence and express parallel decoding opportunities in their own responses.
no code implementations • 18 Nov 2024 • Mathew Jacob, Erik Lindgren, Matei Zaharia, Michael Carbin, Omar Khattab, Andrew Drozdov
Rerankers, typically cross-encoders, are often used to re-score the documents retrieved by cheaper initial IR systems.
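For reference, the two-stage pipeline studied here looks roughly like the sketch below. Both scoring functions are toy lexical-overlap stand-ins for a real first-stage retriever and a learned cross-encoder, and the parameters `k_retrieve` and `k_final` are illustrative assumptions rather than the paper's setup.

```python
# Minimal retrieve-then-rerank sketch. `cheap_retrieval_score` stands in for an
# inexpensive first-stage retriever; `cross_encoder_score` stands in for a
# learned cross-encoder that reads the query and document jointly.

def cheap_retrieval_score(query: str, doc: str) -> float:
    # Lexical overlap as a stand-in for BM25 or a bi-encoder.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def cross_encoder_score(query: str, doc: str) -> float:
    # Placeholder for a cross-encoder forward pass over the (query, doc) pair.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / (len(q | d) or 1)

def rerank(query: str, corpus: list, k_retrieve: int = 100, k_final: int = 10) -> list:
    # Stage 1: cheap retrieval narrows the corpus to k_retrieve candidates.
    candidates = sorted(corpus, key=lambda doc: cheap_retrieval_score(query, doc),
                        reverse=True)[:k_retrieve]
    # Stage 2: the more expensive reranker re-scores only those candidates.
    return sorted(candidates, key=lambda doc: cross_encoder_score(query, doc),
                  reverse=True)[:k_final]
```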
no code implementations • 5 Nov 2024 • Quinn Leng, Jacob Portes, Sam Havens, Matei Zaharia, Michael Carbin
Can these new long-context models improve RAG performance?
no code implementations • 21 Aug 2024 • Ellie Y. Cheng, Eric Atkinson, Guillaume Baudart, Louis Mandel, Michael Carbin
In this work, we present inference plans, a programming interface that enables developers to control the partitioning of random variables during hybrid particle filtering.
no code implementations • 21 Jul 2024 • Logan Weber, Jesse Michel, Alex Renda, Michael Carbin
Alternatively, language models trained on a large dataset that includes many programs can consume program text directly to act as a neural surrogate.
1 code implementation • 27 Mar 2024 • Elliot Bolton, Abhinav Venigalla, Michihiro Yasunaga, David Hall, Betty Xiong, Tony Lee, Roxana Daneshjou, Jonathan Frankle, Percy Liang, Michael Carbin, Christopher D. Manning
Models such as GPT-4 and Med-PaLM 2 have demonstrated impressive performance on a wide variety of biomedical NLP tasks.
no code implementations • 7 Oct 2023 • Tian Jin, Nolan Clement, Xin Dong, Vaishnavh Nagarajan, Michael Carbin, Jonathan Ragan-Kelley, Gintare Karolina Dziugaite
We study two natural scaling techniques -- weight pruning and simply training a smaller or larger model, which we refer to as dense scaling -- and their effects on two core capabilities of LLMs: (a) recalling facts presented during pre-training and (b) processing information presented in-context during inference.
no code implementations • 21 Sep 2023 • Alex Renda, Yi Ding, Michael Carbin
We first characterize the proportion of data to sample from each region of a program's input space (corresponding to different execution paths of the program) based on the complexity of learning a surrogate of the corresponding execution path.
no code implementations • 9 Mar 2023 • Cambridge Yang, Michael Littman, Michael Carbin
In particular, for the analysis that considers only sample complexity, we prove that if an objective given as an oracle is uniformly continuous, then it is PAC-learnable.
no code implementations • 10 Dec 2022 • Yi Ding, Aijia Gao, Thibaud Ryden, Kaushik Mitra, Sukumar Kalmanje, Yanai Golany, Michael Carbin, Henry Hoffmann
While it is tempting to use prior machine learning techniques for predicting job duration, we find that the structure of the maintenance job scheduling problem creates a unique challenge.
no code implementations • 25 Oct 2022 • Tian Jin, Michael Carbin, Daniel M. Roy, Jonathan Frankle, Gintare Karolina Dziugaite
Pruning models in this over-parameterized regime leads to a contradiction -- while theory predicts that reducing model size harms generalization, pruning to a range of sparsities nonetheless improves it.
no code implementations • 22 Apr 2022 • Hyunji Kim, Ahsan Pervaiz, Henry Hoffmann, Michael Carbin, Yi Ding
Such solutions monitor past system executions to learn the system's behavior under different hardware resource allocations before dynamically tuning resources to optimize the application execution.
no code implementations • 11 Apr 2022 • Yi Ding, Alex Renda, Ahsan Pervaiz, Michael Carbin, Henry Hoffmann
Our evaluation shows that compared to the state-of-the-art SEML approach in computer systems optimization, Cello improves latency by 1.19x for minimizing latency under a power constraint, and improves energy by 1.18x for minimizing energy under a latency constraint.
1 code implementation • 12 Dec 2021 • Alex Renda, Yi Ding, Michael Carbin
With surrogate adaptation, programmers develop a surrogate of a program and then retrain that surrogate on a different task.
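A minimal sketch of that two-stage workflow, assuming a toy numerical program and a small PyTorch regressor; the specific program, architecture, and adapted task are invented for illustration, and only the fit-then-retrain structure reflects the idea described above.

```python
import torch

def fit(net, x, y, epochs=200, lr=1e-3):
    """Gradient-descent fitting loop shared by both stages."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        torch.nn.functional.mse_loss(net(x), y).backward()
        opt.step()
    return net

# Stage 1: train a neural surrogate to mimic the original program's input/output behavior.
program = lambda v: v ** 2          # stand-in for an arbitrary numerical program
x = torch.linspace(-1, 1, 256).unsqueeze(-1)
surrogate = fit(
    torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)),
    x, program(x))

# Stage 2 (surrogate adaptation): retrain the same surrogate on data from a
# different task, here a hypothetical shifted objective, reusing its weights.
adapted = fit(surrogate, x, program(x) + 0.1 * x)
```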
no code implementations • 24 Nov 2021 • Cambridge Yang, Michael Littman, Michael Carbin
In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.
no code implementations • AAAI Workshop CLeaR 2022 • Cambridge Yang, Michael Littman, Michael Carbin
In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives.
no code implementations • 30 Apr 2021 • Rajiv Movva, Jonathan Frankle, Michael Carbin
Magnitude pruning is a common, effective technique to identify sparse subnetworks at little cost to accuracy.
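For context, a minimal layer-wise magnitude-pruning sketch in PyTorch is below; the paper may prune at a different granularity (e.g., globally across layers), so treat the threshold choice and the skipped 1-D parameters as illustrative assumptions.

```python
import torch

def magnitude_prune(model: torch.nn.Module, sparsity: float) -> dict:
    """Return a {parameter name: binary mask} dict that zeroes the smallest-magnitude weights per layer."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:          # skip biases and normalization parameters
            continue
        k = int(sparsity * param.numel())
        if k == 0:
            masks[name] = torch.ones_like(param)
            continue
        # Magnitude of the k-th smallest weight serves as the pruning threshold.
        threshold = param.detach().abs().flatten().kthvalue(k).values
        masks[name] = (param.detach().abs() > threshold).float()
    return masks

# Applying the masks zeroes the pruned weights in place:
# for name, param in model.named_parameters():
#     if name in masks:
#         param.data.mul_(masks[name])
```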
1 code implementation • CVPR 2021 • Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang
We extend the scope of LTH and question whether matching subnetworks that enjoy the same downstream transfer performance still exist in pre-trained computer vision models.
2 code implementations • 8 Oct 2020 • Alex Renda, Yishen Chen, Charith Mendis, Michael Carbin
In this paper we present DiffTune, a system for learning the parameters of x86 basic block CPU simulators from coarse-grained end-to-end measurements.
no code implementations • ICLR 2021 • Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
Recent work has explored the possibility of pruning neural networks at initialization.
2 code implementations • NeurIPS 2020 • Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity.
1 code implementation • 15 Jul 2020 • Benjamin Sherman, Jesse Michel, Michael Carbin
Deep learning is moving towards increasingly sophisticated optimization objectives that employ higher-order functions, such as integration, continuous optimization, and root-finding.
no code implementations • 18 Jun 2020 • Jonathan S. Rosenfeld, Jonathan Frankle, Michael Carbin, Nir Shavit
We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task.
no code implementations • 7 May 2020 • Riyadh Baghdadi, Abdelkader Nadir Debbagh, Kamel Abdous, Fatima Zohra Benhamida, Alex Renda, Jonathan Elliott Frankle, Michael Carbin, Saman Amarasinghe
In this paper, we demonstrate a compiler that can optimize sparse and recurrent neural networks, both of which are currently outside the scope of existing neural network compilers (here, sparse neural networks refers to networks that can be accelerated with sparse tensor algebra techniques).
2 code implementations • ICLR 2020 • Alex Renda, Jonathan Frankle, Michael Carbin
Learning rate rewinding (which we propose) trains the unpruned weights from their final values using the same learning rate schedule as weight rewinding.
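A schematic comparison of the three retraining strategies, written as a helper that selects the starting weights and learning-rate schedule for the retraining phase; the checkpoint names and schedule indexing are assumptions for illustration rather than the paper's exact protocol.

```python
def retraining_config(strategy, w_final, w_rewind, lr_schedule, t_rewind):
    """Return (initial_weights, learning_rate_schedule) for retraining a pruned network.

    lr_schedule is the per-step schedule of the original training run and
    t_rewind is the step to rewind to; both are illustrative assumptions.
    """
    if strategy == "fine-tune":
        # Final weights, small constant (final) learning rate for the remaining steps.
        return w_final, [lr_schedule[-1]] * (len(lr_schedule) - t_rewind)
    if strategy == "weight-rewind":
        # Early-checkpoint weights, replay the tail of the original schedule.
        return w_rewind, lr_schedule[t_rewind:]
    if strategy == "lr-rewind":
        # Final weights, but still replay the tail of the original schedule.
        return w_final, lr_schedule[t_rewind:]
    raise ValueError(f"unknown strategy: {strategy}")
```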
2 code implementations • ICML 2020 • Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random data order and augmentation).
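A minimal sketch of the linear-interpolation check that underlies this question: evaluate test error along the straight line between two trained copies of the network. Here `eval_error` is a hypothetical user-supplied callable, and the number of interpolation points is arbitrary.

```python
import copy
import torch

def linear_interpolation_error(model_a, model_b, eval_error, num_points=11):
    """Evaluate error along the linear path between two trained networks.

    A flat error curve (no bump above the endpoints) indicates the two runs
    are linearly mode connected.
    """
    errors = []
    params_a = dict(model_a.named_parameters())
    params_b = dict(model_b.named_parameters())
    for i in range(num_points):
        alpha = i / (num_points - 1)
        interpolated = copy.deepcopy(model_a)
        with torch.no_grad():
            for name, param in interpolated.named_parameters():
                # Convex combination of the two trained weight vectors.
                param.copy_((1 - alpha) * params_a[name] + alpha * params_b[name])
        errors.append(eval_error(interpolated))
    return errors
```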
1 code implementation • NeurIPS 2019 • Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe, Michael Carbin
We show that the learnt policy produces a vectorization scheme that is better than industry-standard compiler heuristics in terms of both static measures and runtime performance.
no code implementations • 25 Sep 2019 • Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
We observe that these subnetworks match the accuracy of the full network only when two SGD runs for the same subnetwork are connected by linear paths with no change in test error.
3 code implementations • 5 Mar 2019 • Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin
With this change, it finds small subnetworks of deeper networks (e.g., 80% sparsity on ResNet-50) that can complete the training process to match the accuracy of the original network on more challenging tasks (e.g., ImageNet).
3 code implementations • 21 Aug 2018 • Charith Mendis, Alex Renda, Saman Amarasinghe, Michael Carbin
Predicting the number of clock cycles a processor takes to execute a block of assembly instructions in steady state (the throughput) is important for both compiler designers and performance engineers.
no code implementations • 20 Mar 2018 • Justin Gottschlich, Armando Solar-Lezama, Nesime Tatbul, Michael Carbin, Martin Rinard, Regina Barzilay, Saman Amarasinghe, Joshua B. Tenenbaum, Tim Mattson
In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research.
24 code implementations • ICLR 2019 • Jonathan Frankle, Michael Carbin
Based on these results, we articulate the "lottery ticket hypothesis": dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations.
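A schematic sketch of the iterative magnitude pruning loop used to search for winning tickets; `train` and `magnitude_prune` are hypothetical helpers passed in by the caller, and the round count and pruning fraction are illustrative, not the paper's exact hyperparameters.

```python
def find_winning_ticket(init_weights, train, magnitude_prune, rounds=5, prune_fraction=0.2):
    """Iterative magnitude pruning: train, prune, then rewind survivors to their initial values."""
    weights = dict(init_weights)                 # theta_0: the original random initialization
    mask = {name: 1.0 for name in weights}       # start from the fully dense network
    for _ in range(rounds):
        trained = train(weights, mask)                           # train the currently unpruned weights
        mask = magnitude_prune(trained, mask, prune_fraction)    # drop the smallest surviving weights
        weights = dict(init_weights)                             # rewind survivors to their original init
    return mask, weights                         # winning ticket = mask + original initialization
```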