1 code implementation • 11 Apr 2024 • Rishabh Ranjan, Saurabh Garg, Mrigank Raman, Carlos Guestrin, Zachary Chase Lipton

This phenomenon is especially prominent in high-noise settings.

no code implementations • 7 Mar 2024 • Liana Patel, Peter Kraft, Carlos Guestrin, Matei Zaharia

Applications increasingly leverage mixed-modality data, and must jointly search over vector data, such as embedded images, text and video, as well as structured data, such as attributes and keywords.

no code implementations • 20 Nov 2023 • Theodora Worledge, Judy Hanwen Shen, Nicole Meister, Caleb Winston, Carlos Guestrin

As businesses, products, and services spring up around large language models, the trustworthiness of these models hinges on the verifiability of their outputs.

1 code implementation • 20 Oct 2023 • Yu Sun, Xinhao Li, Karan Dalal, Chloe Hsu, Sanmi Koyejo, Carlos Guestrin, Xiaolong Wang, Tatsunori Hashimoto, Xinlei Chen

Our inner loop turns out to be equivalent to linear attention when the inner-loop learner is only a linear model, and to self-attention when it is a kernel estimator.
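
As a rough illustration of how such an equivalence can arise, the sketch below compares a linear inner-loop learner with unnormalized linear attention. The setup is an assumption made for illustration and is not taken from the paper: the learner starts at W = 0, takes one full-batch gradient step with learning rate 1 on a squared loss over (key, value) pairs, and then predicts for a query. The function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def inner_loop_linear(keys, values, query):
    """Assumed setup: one full-batch gradient step from W = 0 on
    0.5 * sum_i ||W k_i - v_i||^2 with learning rate 1, then predict W q."""
    d_out, d_in = len(values[0]), len(keys[0])
    W = [[0.0] * d_in for _ in range(d_out)]
    for k, v in zip(keys, values):
        # At W = 0 the gradient for this pair is -(v k^T),
        # so stepping against it adds the outer product v k^T.
        for r in range(d_out):
            for c in range(d_in):
                W[r][c] += v[r] * k[c]
    return [dot(row, query) for row in W]


def linear_attention(keys, values, query):
    """Unnormalized linear attention: sum_i (k_i . q) * v_i."""
    out = [0.0] * len(values[0])
    for k, v in zip(keys, values):
        weight = dot(k, query)
        for r in range(len(out)):
            out[r] += weight * v[r]
    return out


keys = [[1.0, 2.0], [0.5, -1.0]]
values = [[1.0, 0.0, 2.0], [3.0, 1.0, -1.0]]
query = [2.0, 1.0]
print(inner_loop_linear(keys, values, query))
print(linear_attention(keys, values, query))
```

Under these assumptions both functions compute the same vector, because the learned matrix is the sum of outer products of values and keys.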

2 code implementations • NeurIPS 2023 • Yann Dubois, Xuechen Li, Rohan Taori, Tianyi Zhang, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto

As a demonstration of the research possible in AlpacaFarm, we find that methods that use a reward model can substantially improve over supervised fine-tuning and that our reference PPO implementation leads to a +10% improvement in win-rate against Davinci003.

no code implementations • 11 Feb 2023 • Daniel Kang, Xuechen Li, Ion Stoica, Carlos Guestrin, Matei Zaharia, Tatsunori Hashimoto

Recent advances in instruction-following large language models (LLMs) have led to dramatic improvements in a range of NLP tasks.

1 code implementation • 20 Feb 2021 • Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari

Recent observations have advanced our understanding of the neural network optimization landscape, revealing the existence of (1) paths of high accuracy containing diverse solutions and (2) wider minima offering improved performance.

1 code implementation • ICML 2020 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin

When using large-batch training to speed up stochastic gradient descent, learning rates must adapt to new batch sizes in order to maximize speed-ups and preserve model quality.

no code implementations • 18 Jun 2020 • Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Carlos Guestrin, Josh M. Susskind

We introduce Set Distribution Networks (SDNs), a novel framework that learns to autoencode and freely generate sets.

1 code implementation • ICML 2020 • Emilien Dupont, Miguel Angel Bautista, Alex Colburn, Aditya Sankar, Carlos Guestrin, Josh Susskind, Qi Shan

We propose a framework for learning neural scene representations directly from images, without 3D supervision.

4 code implementations • ACL 2020 • Marco Tulio Ribeiro, Tongshuang Wu, Carlos Guestrin, Sameer Singh

Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors.

1 code implementation • NeurIPS 2019 • Shuangfei Zhai, Walter Talbott, Carlos Guestrin, Joshua M. Susskind

In contrast to a traditional view where the discriminator learns a constant function when reaching convergence, here we show that it can provide useful information for downstream tasks, e.g., feature extraction for classification.

no code implementations • 25 Sep 2019 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin

When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales in order to maintain training effectiveness.

no code implementations • 25 Sep 2019 • Shuangfei Zhai, Carlos Guestrin, Joshua M. Susskind

During inference time, the HBAE consists of two sampling steps: first a latent code for the input is sampled, and then this code is passed to the conditional generator to output a stochastic reconstruction.

1 code implementation • ACL 2019 • Marco Tulio Ribeiro, Carlos Guestrin, Sameer Singh

Although current evaluation of question-answering systems treats predictions in isolation, we need to consider the relationship between predictions to measure true understanding.

no code implementations • 15 May 2019 • Chen Huang, Shuangfei Zhai, Walter Talbott, Miguel Angel Bautista, Shih-Yu Sun, Carlos Guestrin, Josh Susskind

In most machine learning training paradigms a fixed, often handcrafted, loss function is assumed to be a good proxy for an underlying evaluation metric.

no code implementations • NeurIPS 2018 • Tyler B. Johnson, Carlos Guestrin

In theory, importance sampling speeds up stochastic gradient algorithms for supervised learning by prioritizing training examples.
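
A minimal sketch of the standard reweighting behind importance sampling for stochastic gradients, under assumptions made for illustration: per-example gradients are scalars, and example i is drawn with probability p_i. Scaling the sampled gradient by 1 / (n * p_i) keeps the estimate unbiased for the mean gradient. The function names and numbers are hypothetical and do not come from the paper.

```python
import random


def sample_weighted_gradient(grads, probs, rng):
    """Draw one example index with probability probs[i] and return the
    reweighted gradient g_i / (n * p_i)."""
    n = len(grads)
    i = rng.choices(range(n), weights=probs, k=1)[0]
    return grads[i] / (n * probs[i])


def expected_estimate(grads, probs):
    """Exact expectation of the reweighted estimate over the sampling
    distribution: sum_i p_i * g_i / (n * p_i)."""
    n = len(grads)
    return sum(p * (g / (n * p)) for g, p in zip(grads, probs))


grads = [1.0, -2.0, 4.0]   # hypothetical per-example gradients
probs = [0.2, 0.3, 0.5]    # hypothetical sampling probabilities (sum to 1)
print(expected_estimate(grads, probs), sum(grads) / len(grads))
print(sample_weighted_gradient(grads, probs, random.Random(0)))
```

The expectation matches the plain average of the gradients for any strictly positive sampling distribution; the choice of distribution only affects the variance of the estimate.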

no code implementations • 20 Jul 2018 • Tyler B. Johnson, Carlos Guestrin

By reducing optimization to a sequence of smaller subproblems, working set algorithms achieve fast convergence times for many machine learning problems.

no code implementations • 11 Jul 2018 • Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Specialized Deep Learning (DL) acceleration stacks, designed for a specific set of frameworks, model architectures, operators, and data types, offer the allure of high performance while sacrificing flexibility.

1 code implementation • ACL 2018 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

Complex machine learning models for NLP are often brittle, making different predictions for input instances that are extremely similar semantically.

no code implementations • NeurIPS 2018 • Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems.

1 code implementation • 1 May 2018 • Pouya Pezeshkpour, Carlos Guestrin, Sameer Singh

Matrix factorization is a well-studied task in machine learning for compactly representing large, noisy data.

1 code implementation • 12 Feb 2018 • Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

Experimental results show that TVM delivers performance across hardware back-ends that is competitive with state-of-the-art, hand-tuned libraries for low-power CPUs, mobile GPUs, and server-class GPUs.

no code implementations • ICML 2017 • Tyler B. Johnson, Carlos Guestrin

Coordinate descent (CD) is a scalable and simple algorithm for solving many optimization problems in machine learning.
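
For readers unfamiliar with the method, here is a minimal sketch of cyclic coordinate descent on a small quadratic objective, 0.5 * x^T A x - b^T x, with A symmetric positive definite. The objective, the matrix, and the function name are assumptions chosen for illustration; the paper's own algorithm is not reproduced here.

```python
def coordinate_descent(A, b, sweeps=50):
    """Minimize 0.5 * x^T A x - b^T x by exactly minimizing over one
    coordinate at a time, cycling through the coordinates in order."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Setting the partial derivative with respect to x_i to zero gives
            # A[i][i] * x_i = b_i - sum_{j != i} A[i][j] * x_j.
            rest = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - rest) / A[i][i]
    return x


A = [[4.0, 1.0], [1.0, 3.0]]  # hypothetical symmetric positive definite matrix
b = [1.0, 2.0]
print(coordinate_descent(A, b))
```

Each update touches a single coordinate and needs only one row of A, which is part of what makes the approach simple and scalable.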

no code implementations • NeurIPS 2016 • Tyler B. Johnson, Carlos Guestrin

We develop methods for rapidly identifying important components of a convex optimization problem for the purpose of achieving fast convergence times.

no code implementations • 22 Nov 2016 • Sameer Singh, Marco Tulio Ribeiro, Carlos Guestrin

Recent work in model-agnostic explanations of black-box machine learning has demonstrated that interpretability of complex models does not have to come at the cost of accuracy or model flexibility.

no code implementations • 17 Nov 2016 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

At the core of interpretable machine learning is the question of whether humans are able to make accurate predictions about a model's behavior.

no code implementations • 16 Jun 2016 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

Understanding why machine learning models behave the way they do empowers both system designers and end-users in many ways: in model selection, feature engineering, in order to trust and act upon the predictions, and in more intuitive user interfaces.

no code implementations • 1 Jun 2016 • Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin

We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization.

6 code implementations • 21 Apr 2016 • Tianqi Chen, Bing Xu, Chiyuan Zhang, Carlos Guestrin

In the extreme case, our analysis also shows that the memory consumption can be reduced to O(log n) with as little as O(n log n) extra cost for forward computation.

25 code implementations • 9 Mar 2016 • Tianqi Chen, Carlos Guestrin

In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges.

27 code implementations • 16 Feb 2016 • Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin

Despite widespread adoption, machine learning models remain mostly black boxes.

no code implementations • NeurIPS 2014 • Tianyi Zhou, Jeff Bilmes, Carlos Guestrin

We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the $k$ extremal rays spanning the conical hull of a data point set.

no code implementations • CVPR 2014 • Santosh K. Divvala, Ali Farhadi, Carlos Guestrin

How can we learn a model for any concept that exhaustively covers all its appearance variations, while requiring minimal or no human supervision for compiling the vocabulary of visual variance, gathering the training images and annotations, and learning the models?

5 code implementations • 17 Feb 2014 • Tianqi Chen, Emily B. Fox, Carlos Guestrin

Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.

no code implementations • 23 Jan 2014 • Jonathan Huang, Ashish Kapoor, Carlos Guestrin

Simultaneously addressing all of these challenges, i.e., designing a compactly representable model which is amenable to efficient inference and can be learned using partial ranking data, is a difficult task, but is necessary if we would like to scale to problems with nontrivial size.

no code implementations • 15 Jan 2014 • Amarjeet Singh, Andreas Krause, Carlos Guestrin, William J. Kaiser

In this paper, we present an efficient approach for near-optimally solving the NP-hard optimization problem of planning such informative paths.

no code implementations • 15 Jan 2014 • Andreas Krause, Carlos Guestrin

In a sensor network, for example, it is important to select the subset of sensors that is expected to provide the strongest reduction in uncertainty.
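
To make the selection problem concrete, the sketch below picks sensors with a simple greedy heuristic. This is a swapped-in illustration, not the paper's method: the number of newly covered locations stands in for reduction in uncertainty, and the sensor names and coverage sets are hypothetical.

```python
def greedy_select(coverage, budget):
    """Pick up to `budget` sensors, each time taking the sensor that covers
    the most locations not yet covered by earlier picks."""
    chosen = []
    covered = set()
    for _ in range(budget):
        best, best_gain = None, 0
        for sensor, cells in coverage.items():
            if sensor in chosen:
                continue
            gain = len(cells - covered)  # marginal gain of adding this sensor
            if gain > best_gain:
                best, best_gain = sensor, gain
        if best is None:  # no remaining sensor adds anything new
            break
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered


coverage = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}
print(greedy_select(coverage, budget=2))
```

With a budget of two, the heuristic first takes the sensor with the largest coverage and then the one that adds the most new locations given that first choice.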

no code implementations • NeurIPS 2011 • Yisong Yue, Carlos Guestrin

Diversified retrieval and online learning are two core research areas in the design of modern information retrieval systems. In this paper, we propose the linear submodular bandits problem, which is an online learning setting for optimizing a general class of feature-rich submodular utility models for diversified retrieval.

no code implementations • NeurIPS 2010 • Anton Chechetka, Carlos Guestrin

We present a simple and effective approach to learning tractable conditional random fields with structure that depends on the evidence.

no code implementations • NeurIPS 2010 • Danny Bickson, Carlos Guestrin

Using stable distributions, a heavy-tailed family of distributions which is a generalization of Cauchy, Lévy and Gaussian distributions, we show, for the first time, how to compute both exact and approximate inference in such a linear multivariate graphical model.

2 code implementations • 25 Jun 2010 • Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Joseph M. Hellerstein

Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging.

no code implementations • NeurIPS 2009 • Jonathan Huang, Carlos Guestrin

Representing distributions over permutations can be a daunting task due to the fact that the number of permutations of n objects scales factorially in n. One recent way that has been used to reduce storage complexity has been to exploit probabilistic independence, but as we argue, full independence assumptions impose strong sparsity constraints on distributions and are unsuitable for modeling rankings.

no code implementations • NeurIPS 2007 • Anton Chechetka, Carlos Guestrin

We present the first truly polynomial algorithm for learning the structure of bounded-treewidth junction trees -- an attractive subclass of probabilistic graphical models that permits both the compact representation of probability distributions and efficient exact inference.

1 code implementation • SIGKDD 2007 • Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance

We show that the approach scales, achieving speedups and savings in storage of several orders of magnitude.
