no code implementations • 11 Feb 2023 • Daniel Kang, Xuechen Li, Ion Stoica, Carlos Guestrin, Matei Zaharia, Tatsunori Hashimoto
Recent advances in instruction-following large language models (LLMs) have led to dramatic improvements in a range of NLP tasks.
1 code implementation • 20 Feb 2021 • Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari
Recent observations have advanced our understanding of the neural network optimization landscape, revealing the existence of (1) paths of high accuracy containing diverse solutions and (2) wider minima offering improved performance.
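A minimal PyTorch sketch of probing the first observation: interpolate linearly between two independently trained solutions and measure accuracy along the path. `make_model`, `loader`, and the two state dicts are hypothetical placeholders; this is an illustration of the phenomenon, not the paper's training method.

```python
import torch

def accuracy(model, loader, device="cpu"):
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            pred = model(x.to(device)).argmax(dim=1)
            correct += (pred == y.to(device)).sum().item()
            total += y.numel()
    return correct / total

def path_accuracies(make_model, state_a, state_b, loader, steps=11):
    """Evaluate accuracy at evenly spaced points on the segment between two solutions."""
    accs = []
    for t in [i / (steps - 1) for i in range(steps)]:
        model = make_model()
        # blend the two checkpoints; cast back to each tensor's original dtype
        blended = {k: ((1 - t) * v.float() + t * state_b[k].float()).to(v.dtype)
                   for k, v in state_a.items()}
        model.load_state_dict(blended)
        accs.append((t, accuracy(model, loader)))
    return accs
```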
1 code implementation • ICML 2020 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using large-batch training to speed up stochastic gradient descent, learning rates must adapt to new batch sizes in order to maximize speed-ups and preserve model quality.
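As background, the common linear-scaling heuristic that large-batch recipes start from can be sketched as follows; the paper proposes its own adaptive rule, so treat this only as the baseline idea it improves on (all names and values are illustrative).

```python
def scaled_lr(base_lr, base_batch, batch, step, warmup_steps=500):
    """Linear scaling rule with a short warmup, a common large-batch baseline."""
    target = base_lr * (batch / base_batch)   # scale LR proportionally to batch size
    if step < warmup_steps:                   # warmup avoids divergence early in training
        return target * (step + 1) / warmup_steps
    return target

# e.g. base_lr=0.1 tuned at batch 256, now training at batch 4096:
# scaled_lr(0.1, 256, 4096, step=10_000) -> 1.6
```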
no code implementations • 18 Jun 2020 • Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Carlos Guestrin, Josh M. Susskind
We introduce Set Distribution Networks (SDNs), a novel framework that learns to autoencode and freely generate sets.
1 code implementation • ICML 2020 • Emilien Dupont, Miguel Angel Bautista, Alex Colburn, Aditya Sankar, Carlos Guestrin, Josh Susskind, Qi Shan
We propose a framework for learning neural scene representations directly from images, without 3D supervision.
4 code implementations • ACL 2020 • Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh
Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors.
1 code implementation • NeurIPS 2019 • Shuangfei Zhai, Walter Talbott, Carlos Guestrin, Joshua M. Susskind
In contrast to the traditional view that the discriminator learns a constant function at convergence, here we show that it can provide useful information for downstream tasks, e.g., feature extraction for classification.
no code implementations • 25 Sep 2019 • Shuangfei Zhai, Carlos Guestrin, Joshua M. Susskind
At inference time, the HBAE performs two sampling steps: first, a latent code for the input is sampled; then this code is passed to the conditional generator to produce a stochastic reconstruction.
no code implementations • 25 Sep 2019 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin
When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales in order to maintain training effectiveness.
1 code implementation • ACL 2019 • Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh
Although current evaluation of question-answering systems treats predictions in isolation, we need to consider the relationship between predictions to measure true understanding.
no code implementations • 15 May 2019 • Chen Huang, Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Shih-Yu Sun, Carlos Guestrin, Josh Susskind
In most machine learning training paradigms, a fixed, often handcrafted loss function is assumed to be a good proxy for an underlying evaluation metric.
no code implementations • NeurIPS 2018 • Tyler B. Johnson, Carlos Guestrin
In theory, importance sampling speeds up stochastic gradient algorithms for supervised learning by prioritizing training examples.
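A minimal sketch of the idea (not the paper's algorithm): draw training examples in proportion to a per-example score and reweight their gradients so the SGD update remains unbiased. The score used here, the feature norm, is a crude illustrative stand-in for the gradient norm.

```python
import numpy as np

def importance_sgd(X, y, steps=1000, lr=0.01, seed=0):
    """Importance-sampled SGD for least squares with unbiased reweighting."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    p = np.linalg.norm(X, axis=1)        # sampling scores (illustrative choice)
    p = p / p.sum()
    for _ in range(steps):
        i = rng.choice(n, p=p)
        g = (X[i] @ w - y[i]) * X[i]     # per-example gradient
        w -= lr * g / (n * p[i])         # 1/(n p_i) keeps the gradient estimate unbiased
    return w
```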
no code implementations • 20 Jul 2018 • Tyler B. Johnson, Carlos Guestrin
By reducing optimization to a sequence of smaller subproblems, working set algorithms achieve fast convergence times for many machine learning problems.
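A minimal sketch of a generic working-set loop for the lasso, assuming scikit-learn for the restricted subproblem; the growth rule and tolerance here are illustrative, not this paper's method.

```python
import numpy as np
from sklearn.linear_model import Lasso

def working_set_lasso(X, y, lam, max_rounds=20, grow=10, tol=1e-4):
    n, d = X.shape
    w = np.zeros(d)
    active = []
    for _ in range(max_rounds):
        # optimality violations for coordinates outside the working set
        grad = X.T @ (X @ w - y) / n
        violation = np.maximum(np.abs(grad) - lam, 0.0)
        violation[active] = 0.0
        if violation.max() <= tol:
            break
        active += [int(j) for j in np.argsort(-violation)[:grow] if violation[j] > tol]
        # solve the small restricted problem, then map its solution back
        sub = Lasso(alpha=lam, fit_intercept=False, max_iter=10000).fit(X[:, active], y)
        w[:] = 0.0
        w[active] = sub.coef_
    return w
```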
no code implementations • 11 Jul 2018 • Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility.
1 code implementation • ACL 2018 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
Complex machine learning models for NLP are often brittle, making different predictions for input instances that are extremely similar semantically.
no code implementations • NeurIPS 2018 • Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems.
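As a toy illustration (not TVM code) of one scheduling decision such a system must make automatically, here is a loop-tiled matrix multiplication; the tile size is the kind of knob a learned optimizer tunes per hardware back-end.

```python
import numpy as np

def tiled_matmul(A, B, tile=64):
    """Blocked matrix multiplication: process one cache-friendly tile at a time."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C
```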
1 code implementation • 1 May 2018 • Pouya Pezeshkpour, Carlos Guestrin, Sameer Singh
Matrix factorization is a well-studied task in machine learning for compactly representing large, noisy data.
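A minimal sketch of the general task (not this paper's specific method): factorize a partially observed matrix by running SGD over the observed entries.

```python
import numpy as np

def factorize(entries, n_rows, n_cols, rank=10, lr=0.01, reg=0.05, epochs=20, seed=0):
    """entries: list of (i, j, value) observations of a large, noisy matrix."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_rows, rank))
    V = 0.1 * rng.standard_normal((n_cols, rank))
    for _ in range(epochs):
        for i, j, v in entries:
            err = v - U[i] @ V[j]                 # reconstruction error for this entry
            U[i] += lr * (err * V[j] - reg * U[i])  # regularized SGD updates
            V[j] += lr * (err * U[i] - reg * V[j])
    return U, V
```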
1 code implementation • 12 Feb 2018 • Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Experimental results show that TVM delivers performance across hardware back-ends that is competitive with state-of-the-art, hand-tuned libraries for low-power CPUs, mobile GPUs, and server-class GPUs.
no code implementations • ICML 2017 • Tyler B. Johnson, Carlos Guestrin
Coordinate descent (CD) is a scalable and simple algorithm for solving many optimization problems in machine learning.
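A minimal sketch of cyclic coordinate descent for the lasso, the kind of solver this line of work studies; it is the plain method, not the paper's accelerated variant.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * max(abs(x) - t, 0.0)

def lasso_cd(X, y, lam, iters=100):
    """Minimize 0.5*||y - Xw||^2 + lam*||w||_1 one coordinate at a time."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)
    r = y - X @ w                     # residual, maintained incrementally
    for _ in range(iters):
        for j in range(d):
            r += X[:, j] * w[j]       # remove coordinate j's contribution
            rho = X[:, j] @ r
            w[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * w[j]       # add the updated contribution back
    return w
```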
no code implementations • NeurIPS 2016 • Tyler B. Johnson, Carlos Guestrin
We develop methods for rapidly identifying important components of a convex optimization problem for the purpose of achieving fast convergence times.
no code implementations • 22 Nov 2016 • Sameer Singh, Marco Tulio Ribeiro, Carlos Guestrin
Recent work in model-agnostic explanations of black-box machine learning has demonstrated that interpretability of complex models does not have to come at the cost of accuracy or model flexibility.
no code implementations • 17 Nov 2016 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
At the core of interpretable machine learning is the question of whether humans are able to make accurate predictions about a model's behavior.
no code implementations • 16 Jun 2016 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
Understanding why machine learning models behave the way they do empowers both system designers and end users in many ways: in model selection, in feature engineering, in deciding whether to trust and act on predictions, and in designing more intuitive user interfaces.
no code implementations • 1 Jun 2016 • Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin
We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization.
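A minimal sketch of the idea as stated: randomly prune the ground set, then run standard greedy maximization on the survivors. The keep probability and the set function are illustrative and carry none of the paper's guarantees.

```python
import numpy as np

def greedy_on_pruned(ground, f, k, keep_prob=0.3, seed=0):
    """f: monotone submodular set function over frozensets; pick at most k elements."""
    rng = np.random.default_rng(seed)
    pruned = [e for e in ground if rng.random() < keep_prob]   # random sparsification
    S = frozenset()
    for _ in range(k):
        gains = {e: f(S | {e}) - f(S) for e in pruned if e not in S}
        if not gains:
            break
        S = S | {max(gains, key=gains.get)}                    # add the best marginal gain
    return S
```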
6 code implementations • 21 Apr 2016 • Tianqi Chen, Bing Xu, Chiyuan Zhang, Carlos Guestrin
In the extreme case, our analysis also shows that the memory consumption can be reduced to O(log n) with as little as O(n log n) extra cost for forward computation.
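A minimal sketch of the underlying memory-for-compute trade, using PyTorch's built-in segment checkpointing rather than the paper's implementation: only segment boundaries are stored on the forward pass, and everything else is recomputed during backward. Recursively re-checkpointing inside each segment is what pushes memory toward O(log n) at O(n log n) extra forward cost.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# a deep stack of layers whose activations would normally all be kept for backward
layers = nn.Sequential(*[nn.Sequential(nn.Linear(512, 512), nn.ReLU())
                         for _ in range(32)])
x = torch.randn(64, 512, requires_grad=True)

out = checkpoint_sequential(layers, 4, x)   # keep only 4 segment-boundary activations
out.sum().backward()                        # intermediate activations are recomputed here
```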
25 code implementations • 9 Mar 2016 • Tianqi Chen, Carlos Guestrin
In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges.
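XGBoost ships as a Python package; a minimal usage sketch follows (the hyperparameter values are arbitrary, not recommendations from the paper).

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = xgb.XGBClassifier(n_estimators=300, max_depth=4,
                          learning_rate=0.1, subsample=0.8)
model.fit(X_tr, y_tr)
print("test accuracy:", accuracy_score(y_te, model.predict(X_te)))
```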
23 code implementations • 16 Feb 2016 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin
Despite widespread adoption, machine learning models remain mostly black boxes.
no code implementations • NeurIPS 2014 • Tianyi Zhou, Jeff Bilmes, Carlos Guestrin
We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the k extremal rays spanning the conical hull of a data point set.
no code implementations • CVPR 2014 • Santosh K. Divvala, Ali Farhadi, Carlos Guestrin
How can we learn a model for any concept that exhaustively covers all its appearance variations, while requiring minimal or no human supervision for compiling the vocabulary of visual variance, gathering the training images and annotations, and learning the models?
5 code implementations • 17 Feb 2014 • Tianqi Chen, Emily B. Fox, Carlos Guestrin
Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.
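A minimal sketch of plain HMC with leapfrog integration on a toy Gaussian target; the paper's stochastic-gradient variant adds minibatch gradient noise and a friction term on top of this basic scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def U(theta):                                    # negative log density of a standard Gaussian
    return 0.5 * theta @ theta

def grad_U(theta):
    return theta

def hmc_step(theta, step=0.1, n_leapfrog=20):
    p = rng.standard_normal(theta.shape)         # resample momentum
    theta_new, p_new = theta.copy(), p.copy()
    p_new -= 0.5 * step * grad_U(theta_new)      # leapfrog: initial half step in momentum
    for _ in range(n_leapfrog):
        theta_new += step * p_new
        p_new -= step * grad_U(theta_new)
    p_new += 0.5 * step * grad_U(theta_new)      # undo the surplus half step
    accept_logp = (U(theta) + 0.5 * p @ p) - (U(theta_new) + 0.5 * p_new @ p_new)
    return theta_new if np.log(rng.random()) < accept_logp else theta  # MH correction

samples = [np.zeros(2)]
for _ in range(1000):
    samples.append(hmc_step(samples[-1]))
```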
no code implementations • 23 Jan 2014 • Jonathan Huang, Ashish Kapoor, Carlos Guestrin
Simultaneously addressing all of these challenges, i.e., designing a compactly representable model that is amenable to efficient inference and can be learned from partial ranking data, is a difficult task, but it is necessary if we would like to scale to problems of nontrivial size.
no code implementations • 15 Jan 2014 • Amarjeet Singh, Andreas Krause, Carlos Guestrin, William J. Kaiser
In this paper, we present an efficient approach for near-optimally solving the NP-hard optimization problem of planning such informative paths.
no code implementations • 15 Jan 2014 • Andreas Krause, Carlos Guestrin
In a sensor network, for example, it is important to select the subset of sensors that is expected to provide the strongest reduction in uncertainty.
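A minimal sketch of that selection problem: greedily pick the sensors that most reduce posterior variance under a Gaussian model. The covariance matrix and the variance-reduction criterion here are illustrative; the paper's mutual-information criterion and its guarantees go beyond this sketch.

```python
import numpy as np

def greedy_variance_reduction(K, k, noise=1e-2):
    """K: covariance among candidate sensor locations; greedily choose k sensors."""
    n = K.shape[0]
    chosen = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for j in range(n):
            if j in chosen:
                continue
            S = chosen + [j]
            K_SS = K[np.ix_(S, S)] + noise * np.eye(len(S))
            # posterior covariance of all locations given noisy observations at S
            post = K - K[:, S] @ np.linalg.solve(K_SS, K[S, :])
            gain = np.trace(K) - np.trace(post)   # total variance removed
            if gain > best_gain:
                best, best_gain = j, gain
        chosen.append(best)
    return chosen
```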
no code implementations • NeurIPS 2011 • Yisong Yue, Carlos Guestrin
Diversified retrieval and online learning are two core research areas in the design of modern information retrieval systems. In this paper, we propose the linear submodular bandits problem, which is an online learning setting for optimizing a general class of feature-rich submodular utility models for diversified retrieval.
no code implementations • NeurIPS 2010 • Danny Bickson, Carlos Guestrin
Using stable distributions, a heavy-tailed family of distributions that generalizes the Cauchy, Lévy, and Gaussian distributions, we show, for the first time, how to compute both exact and approximate inference in such a linear multivariate graphical model.
no code implementations • NeurIPS 2010 • Anton Chechetka, Carlos Guestrin
We present a simple and effective approach to learning tractable conditional random fields with structure that depends on the evidence.
2 code implementations • 25 Jun 2010 • Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein
Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging.
no code implementations • NeurIPS 2009 • Jonathan Huang, Carlos Guestrin
Representing distributions over permutations can be daunting because the number of permutations of n objects scales factorially in n. One recent way to reduce storage complexity has been to exploit probabilistic independence, but, as we argue, full independence assumptions impose strong sparsity constraints on distributions and are unsuitable for modeling rankings.
no code implementations • NeurIPS 2007 • Anton Chechetka, Carlos Guestrin
We present the first truly polynomial algorithm for learning the structure of bounded-treewidth junction trees -- an attractive subclass of probabilistic graphical models that permits both the compact representation of probability distributions and efficient exact inference.
1 code implementation • SIGKDD 2007 • Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance
We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude.