no code implementations • 19 Mar 2025 • Kyurae Kim, Zuheng Xu, Jacob R. Gardner, Trevor Campbell
The performance of sequential Monte Carlo (SMC) samplers heavily depends on the tuning of the Markov kernels used in the path proposal.
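To fix ideas about where those Markov kernels enter, below is a minimal tempered SMC sampler sketch in NumPy. The target, temperature schedule, and random-walk step size `step` are illustrative assumptions, not the paper's setup; the move step is where the tuning discussed above matters.

```python
# Minimal tempered SMC sampler sketch: reweight, resample, then move particles
# with a Markov kernel targeting the current tempered distribution.
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):          # unnormalized log density of the final target (toy)
    return -0.5 * np.sum((x - 3.0) ** 2, axis=-1)

def log_reference(x):       # log density of the easy reference (standard normal)
    return -0.5 * np.sum(x ** 2, axis=-1)

def smc(n_particles=512, dim=2, betas=np.linspace(0.0, 1.0, 21), step=0.5):
    x = rng.normal(size=(n_particles, dim))           # draw from the reference
    logw = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # 1) reweight: incremental weight between consecutive tempered targets
        logw += (b - b_prev) * (log_target(x) - log_reference(x))
        # 2) resample (multinomial) to equalize weights
        w = np.exp(logw - logw.max()); w /= w.sum()
        x = x[rng.choice(n_particles, n_particles, p=w)]
        logw[:] = 0.0
        # 3) move: random-walk Metropolis kernel targeting the current tempered
        #    distribution -- `step` is the kind of tuning parameter at issue
        logpi = lambda z: (1 - b) * log_reference(z) + b * log_target(z)
        prop = x + step * rng.normal(size=x.shape)
        accept = np.log(rng.uniform(size=n_particles)) < logpi(prop) - logpi(x)
        x[accept] = prop[accept]
    return x

samples = smc()
```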
no code implementations • 31 Jan 2025 • Natalie Maus, Kyurae Kim, Yimeng Zeng, Haydn Thomas Jones, Fangping Wan, Marcelo Der Torossian Torres, Cesar de la Fuente-Nunez, Jacob R. Gardner
In this work, we introduce a novel problem setting that departs from this paradigm: finding a smaller set of K solutions, where K < T, that collectively "covers" the T objectives.
no code implementations • 1 Nov 2024 • Jonathan Wenger, Kaiwen Wu, Philipp Hennig, Jacob R. Gardner, Geoff Pleiss, John P. Cunningham
Model selection in Gaussian processes scales prohibitively with the size of the training dataset, both in time and memory.
1 code implementation • 15 Jul 2024 • Kaiwen Wu, Jacob R. Gardner
Elliptical slice sampling, when adapted to linearly truncated multivariate normal distributions, is a rejection-free Markov chain Monte Carlo method.
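For reference, here is a sketch of one generic elliptical slice sampling transition (Murray et al., 2010) for a target of the form N(0, Σ)·L(f). The paper above adapts this scheme to likelihoods that are indicators of linear constraints, where the feasible angles can be found in closed form; the shrinkage loop shown here is only the standard version, included to fix ideas.

```python
import numpy as np

rng = np.random.default_rng(0)

def ess_step(f, chol_sigma, log_lik):
    nu = chol_sigma @ rng.normal(size=f.shape)       # auxiliary draw from the Gaussian prior
    log_y = log_lik(f) + np.log(rng.uniform())       # slice level
    theta = rng.uniform(0.0, 2 * np.pi)
    lo, hi = theta - 2 * np.pi, theta
    while True:
        prop = f * np.cos(theta) + nu * np.sin(theta)  # point on the ellipse
        if log_lik(prop) > log_y:
            return prop
        # shrink the angle bracket toward zero and retry
        if theta < 0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)
```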
no code implementations • 6 Jun 2024 • Natalie Maus, Kyurae Kim, Geoff Pleiss, David Eriksson, John P. Cunningham, Jacob R. Gardner
Our approach outperforms standard SVGPs on high-dimensional benchmark tasks in control and molecular design.
no code implementations • 5 Jun 2024 • Wentao Guo, Jikai Long, Yimeng Zeng, Zirui Liu, Xinyu Yang, Yide Ran, Jacob R. Gardner, Osbert Bastani, Christopher De Sa, Xiaodong Yu, Beidi Chen, Zhaozhuo Xu
Zeroth-order optimization (ZO) is a memory-efficient strategy for fine-tuning Large Language Models using only forward passes.
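The forward-pass-only updates referred to above are typically two-point (SPSA-style) gradient estimates. The sketch below illustrates the idea on a toy quadratic; `loss_fn`, the learning rate, and the perturbation scale are placeholders, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def zo_step(params, loss_fn, lr=1e-3, eps=1e-3):
    z = rng.normal(size=params.shape)                # shared random perturbation
    loss_plus = loss_fn(params + eps * z)            # forward pass only
    loss_minus = loss_fn(params - eps * z)           # forward pass only
    grad_est = (loss_plus - loss_minus) / (2 * eps) * z
    return params - lr * grad_est

# toy usage: minimize a quadratic with forward evaluations only
theta = rng.normal(size=10)
for _ in range(1000):
    theta = zo_step(theta, lambda p: np.sum(p ** 2))
```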
no code implementations • 4 Jun 2024 • Kaiwen Wu, Jacob R. Gardner
For conjugate likelihoods, we prove the first $\mathcal{O}(\frac{1}{T})$ non-asymptotic convergence rate of stochastic NGVI.
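For context, the iteration whose rate is quantified above is the standard stochastic natural gradient recursion (textbook definitions, not notation taken from the paper):

$$\widetilde{\nabla}_\lambda \mathcal{L}(\lambda) = F(\lambda)^{-1} \nabla_\lambda \mathcal{L}(\lambda), \qquad \lambda_{t+1} = \lambda_t + \rho_t\, \widehat{\widetilde{\nabla}}_\lambda \mathcal{L}(\lambda_t),$$

where $F(\lambda) = \mathbb{E}_{q_\lambda}\!\left[\nabla_\lambda \log q_\lambda(z)\, \nabla_\lambda \log q_\lambda(z)^\top\right]$ is the Fisher information of the variational family and $\widehat{\widetilde{\nabla}}$ is a stochastic estimate of the natural gradient.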
no code implementations • 3 Jun 2024 • Kyurae Kim, Joohwan Ko, Yi-An Ma, Jacob R. Gardner
For these problems, a popular strategy is to employ SGD with doubly stochastic gradients (doubly SGD): the expectations are estimated using the gradient estimator of each component, while the sum is estimated by subsampling over these estimators.
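A minimal sketch of such a doubly stochastic gradient for an objective of the form $(1/N)\sum_i \mathbb{E}_{z\sim q_\lambda}[f_i(z)]$ is shown below: subsample components $i$ (first source of noise) and estimate each expectation by reparameterized Monte Carlo (second source of noise). The diagonal Gaussian family and the toy $f_i$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, dim = 1000, 5
data = rng.normal(size=(N, dim))

def grad_f_i(z, i):                       # gradient of f_i w.r.t. z (toy example)
    return z - data[i]

def doubly_sgd_grad(mu, log_sigma, batch=32, mc=4):
    idx = rng.choice(N, batch, replace=False)        # subsample the sum over components
    g_mu = np.zeros_like(mu); g_ls = np.zeros_like(log_sigma)
    for _ in range(mc):
        eps = rng.normal(size=dim)
        z = mu + np.exp(log_sigma) * eps             # reparameterization trick
        gz = np.mean([grad_f_i(z, i) for i in idx], axis=0)
        g_mu += gz / mc                              # chain rule through z = mu + sigma * eps
        g_ls += gz * eps * np.exp(log_sigma) / mc
    return g_mu, g_ls

g_mu, g_ls = doubly_sgd_grad(np.zeros(dim), np.zeros(dim))
```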
1 code implementation • 27 Feb 2024 • Samuel Gruffaz, Kyurae Kim, Alain Oliviero Durmus, Jacob R. Gardner
In practice, MCMC-SAEM is often run with asymptotically biased MCMC, for which the consequences are theoretically less understood.
no code implementations • 19 Jan 2024 • Joohwan Ko, Kyurae Kim, Woo Chang Kim, Jacob R. Gardner
In fact, recent computational complexity results for BBVI have established that full-rank variational families scale poorly with the dimensionality of the problem compared to, e.g., mean-field families.

1 code implementation • 26 Oct 2023 • Kaiwen Wu, Jonathan Wenger, Haydn Jones, Geoff Pleiss, Jacob R. Gardner
Training and inference in Gaussian processes (GPs) require solving linear systems with $n\times n$ kernel matrices.
no code implementations • 27 Jul 2023 • Kyurae Kim, Yian Ma, Jacob R. Gardner
We prove that black-box variational inference (BBVI) with control variates, particularly the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification.
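For reference, the sticking-the-landing estimator drops the score term of the variational density, keeping only the path derivative through the reparameterized sample. The sketch below illustrates this for a diagonal Gaussian family and a toy target $p(z)=\mathcal{N}(\mathbf 1, \mathbf I)$; the model and family are assumptions for illustration, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(0)

def stl_grad(mu, log_sigma, n_mc=8):
    # toy target: log p(z) = -0.5 * ||z - 1||^2, so grad_z log p(z) = -(z - 1)
    g_mu = np.zeros_like(mu); g_ls = np.zeros_like(log_sigma)
    sigma = np.exp(log_sigma)
    for _ in range(n_mc):
        eps = rng.normal(size=mu.shape)
        z = mu + sigma * eps                              # reparameterized sample
        grad_logp = -(z - 1.0)                            # d log p / dz
        grad_logq = -(z - mu) / sigma ** 2                # d log q / dz, parameters held fixed
        path = grad_logp - grad_logq                      # STL: path derivative only, score term dropped
        g_mu += path / n_mc
        g_ls += path * eps * sigma / n_mc
    return g_mu, g_ls

g_mu, g_ls = stl_grad(np.zeros(3), np.zeros(3))
```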
no code implementations • 25 May 2023 • Natalie Maus, Yimeng Zeng, Daniel Allen Anderson, Phillip Maffettone, Aaron Solomon, Peyton Greenside, Osbert Bastani, Jacob R. Gardner
Furthermore, it is challenging to adapt pure generative approaches to other settings, e.g., when constraints exist.
no code implementations • NeurIPS 2023 • Kyurae Kim, Jisu Oh, Kaiwen Wu, Yi-An Ma, Jacob R. Gardner
We provide the first convergence guarantee for full black-box variational inference (BBVI), also known as Monte Carlo variational inference.
1 code implementation • NeurIPS 2023 • Kaiwen Wu, Kyurae Kim, Roman Garnett, Jacob R. Gardner
A recent development in Bayesian optimization is the use of local optimization strategies, which can deliver strong empirical performance on high-dimensional problems compared to traditional global strategies.
no code implementations • 18 Mar 2023 • Kyurae Kim, Kaiwen Wu, Jisu Oh, Jacob R. Gardner
Understanding the gradient variance of black-box variational inference (BBVI) is a crucial step for establishing its convergence and developing algorithmic improvements.
1 code implementation • 21 Oct 2022 • Quan Nguyen, Kaiwen Wu, Jacob R. Gardner, Roman Garnett
Local optimization presents a promising approach to expensive, high-dimensional black-box optimization by sidestepping the need to globally explore the search space.
no code implementations • 10 Oct 2022 • Haoyu Wang, Hongming Zhang, Yuqian Deng, Jacob R. Gardner, Dan Roth, Muhao Chen
In this paper, we seek to improve the faithfulness of TempRel extraction models from two perspectives.
Ranked #3 on Temporal Relation Classification on MATRES
1 code implementation • 13 Jun 2022 • Kyurae Kim, Jisu Oh, Jacob R. Gardner, Adji Bousso Dieng, HongSeok Kim
Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient descent (SGD) is challenging since its gradient is defined as an integral over the posterior.
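One standard way to estimate that integral over the posterior, shown below only to make the difficulty concrete, is self-normalized importance sampling with $q$ as the proposal; this is not the scheme proposed in the paper, and the toy posterior and diagonal Gaussian family are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(z):                              # unnormalized log posterior (toy)
    return -0.5 * np.sum((z - 2.0) ** 2, axis=-1)

def inclusive_kl_grad(mu, log_sigma, n=256):
    # estimates grad_phi KL(p || q_phi) = -E_p[grad_phi log q_phi(z)]
    sigma = np.exp(log_sigma)
    z = mu + sigma * rng.normal(size=(n, mu.size))       # proposals from q
    log_q = -0.5 * np.sum(((z - mu) / sigma) ** 2 + 2 * log_sigma, axis=-1)
    logw = log_post(z) - log_q                           # importance log-weights
    w = np.exp(logw - logw.max()); w /= w.sum()          # self-normalize
    score_mu = (z - mu) / sigma ** 2                     # score of a diagonal Gaussian q
    score_ls = ((z - mu) / sigma) ** 2 - 1.0
    g_mu = -(w[:, None] * score_mu).sum(axis=0)
    g_ls = -(w[:, None] * score_ls).sum(axis=0)
    return g_mu, g_ls

g_mu, g_ls = inclusive_kl_grad(np.zeros(2), np.zeros(2))
```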
1 code implementation • 28 Jan 2022 • Natalie Maus, Haydn T. Jones, Juston S. Moore, Matt J. Kusner, John Bradshaw, Jacob R. Gardner
By reformulating the encoder to function as both an encoder for the DAE globally and as a deep kernel for the surrogate model within a trust region, we better align the notion of local optimization in the latent space with local optimization in the input space.
no code implementations • NeurIPS 2021 • Misha Padidar, Xinran Zhu, Leo Huang, Jacob R. Gardner, David Bindel
We demonstrate the full scalability of our approach on a variety of tasks, ranging from a high-dimensional stellarator fusion regression task to training graph convolutional neural networks on Pubmed using Bayesian optimization.
no code implementations • 1 Jul 2021 • Jonathan Wenger, Geoff Pleiss, Philipp Hennig, John P. Cunningham, Jacob R. Gardner
While preconditioning is well understood in the context of CG, we demonstrate that it can also accelerate convergence and reduce variance of the estimates for the log-determinant and its derivative.
1 code implementation • NeurIPS 2020 • Shali Jiang, Daniel R. Jiang, Maximilian Balandat, Brian Karrer, Jacob R. Gardner, Roman Garnett
In this paper, we provide the first efficient implementation of general multi-step lookahead Bayesian optimization, formulated as a sequence of nested optimization problems within a multi-step scenario tree.
1 code implementation • NeurIPS 2020 • Geoff Pleiss, Martin Jankowiak, David Eriksson, Anil Damle, Jacob R. Gardner
Matrix square roots and their inverses arise frequently in machine learning, e.g., when sampling from high-dimensional Gaussians $\mathcal{N}(\mathbf 0, \mathbf K)$ or whitening a vector $\mathbf b$ against covariance matrix $\mathbf K$.
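For concreteness, the two operations mentioned above can be written with a Cholesky factor acting as the square root, as in the baseline sketch below; the paper replaces this cubic-cost factorization with iterative, matrix-multiplication-based routines, so this is shown only for reference, with an illustrative RBF kernel matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
K = np.exp(-0.5 * np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)) + 1e-6 * np.eye(n)

L = np.linalg.cholesky(K)                 # K = L L^T, so L plays the role of K^{1/2}
sample = L @ rng.normal(size=n)           # sample ~ N(0, K)
b = rng.normal(size=n)
whitened = np.linalg.solve(L, b)          # L^{-1} b has identity covariance if b ~ N(0, K)
```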
no code implementations • 21 Feb 2020 • Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner
We introduce Deep Sigma Point Processes, a class of parametric models inspired by the compositional structure of Deep Gaussian Processes (DGPs).
no code implementations • ICML 2020 • Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner
In an extensive empirical comparison with a number of alternative methods for scalable GP regression, we find that the resulting predictive distributions exhibit significantly better calibrated uncertainties and higher log likelihoods, often by as much as half a nat per datapoint.
3 code implementations • NeurIPS 2019 • David Eriksson, Michael Pearce, Jacob R. Gardner, Ryan Turner, Matthias Poloczek
This motivates the design of a local probabilistic approach for global optimization of large-scale high-dimensional problems.
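The sketch below illustrates the trust-region bookkeeping behind such a local probabilistic approach: candidates are proposed only inside a box around the incumbent, and the box expands after consecutive successes and shrinks after consecutive failures. The thresholds and the random search inside the region are simplified stand-ins, not the paper's GP-based Thompson sampling.

```python
import numpy as np

rng = np.random.default_rng(0)

def trust_region_minimize(f, dim, n_iters=200, length=0.8, succ_tol=3, fail_tol=5):
    x_best = rng.uniform(-1, 1, dim)
    f_best = f(x_best)
    successes = failures = 0
    for _ in range(n_iters):
        # propose a candidate inside the current trust region (random here;
        # a local GP surrogate would normally drive this choice)
        cand = np.clip(x_best + length * rng.uniform(-0.5, 0.5, dim), -1, 1)
        f_cand = f(cand)
        if f_cand < f_best:
            x_best, f_best = cand, f_cand
            successes, failures = successes + 1, 0
        else:
            successes, failures = 0, failures + 1
        if successes >= succ_tol:
            length, successes = min(2 * length, 1.6), 0    # expand the region
        elif failures >= fail_tol:
            length, failures = length / 2, 0               # shrink the region
        if length < 1e-3:                                  # restart when the region collapses
            length = 0.8
            x_best = rng.uniform(-1, 1, dim); f_best = f(x_best)
    return x_best, f_best

x_opt, f_opt = trust_region_minimize(lambda x: np.sum(x ** 2), dim=10)
```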
4 code implementations • ICLR 2019 • Chuan Guo, Jacob R. Gardner, Yurong You, Andrew Gordon Wilson, Kilian Q. Weinberger
We propose an intriguingly simple method for the construction of adversarial images in the black-box setting.
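The core idea can be sketched in a few lines: repeatedly pick a random orthonormal basis direction, try stepping in both signs, and keep any step that lowers the model's probability of the true class. The pixel basis, step size, and `prob_fn` interface below are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def simple_blackbox_attack(x, true_label, prob_fn, eps=0.2, n_iters=1000):
    # prob_fn(image, label) -> probability the model assigns to `label`
    x = x.copy()
    p_best = prob_fn(x, true_label)
    dims = rng.permutation(x.size)                 # random order over pixel basis vectors
    for d in dims[:n_iters]:
        q = np.zeros(x.size); q[d] = 1.0           # one-hot basis direction
        q = q.reshape(x.shape)
        for sign in (+1.0, -1.0):
            cand = np.clip(x + sign * eps * q, 0, 1)
            p_new = prob_fn(cand, true_label)
            if p_new < p_best:                     # keep only probability-decreasing steps
                x, p_best = cand, p_new
                break
    return x
```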
3 code implementations • NeurIPS 2019 • Ke Alexander Wang, Geoff Pleiss, Jacob R. Gardner, Stephen Tyree, Kilian Q. Weinberger, Andrew Gordon Wilson
Gaussian processes (GPs) are flexible non-parametric models, with a capacity that grows with the available data.
4 code implementations • NeurIPS 2018 • Jacob R. Gardner, Geoff Pleiss, David Bindel, Kilian Q. Weinberger, Andrew Gordon Wilson
Despite advances in scalable models, the inference tools used for Gaussian processes (GPs) have yet to fully capitalize on developments in computing hardware.
1 code implementation • ICML 2018 • Geoff Pleiss, Jacob R. Gardner, Kilian Q. Weinberger, Andrew Gordon Wilson
One of the most compelling features of Gaussian process (GP) regression is its ability to provide well-calibrated posterior distributions.
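For reference, the posterior whose calibration is at issue is the standard GP regression predictive distribution at a test input $\mathbf x_*$ (textbook formulas, not notation taken from the paper):

$$\mu_* = \mathbf k_*^\top (\mathbf K + \sigma^2 \mathbf I)^{-1} \mathbf y, \qquad \sigma_*^2 = k(\mathbf x_*, \mathbf x_*) - \mathbf k_*^\top (\mathbf K + \sigma^2 \mathbf I)^{-1} \mathbf k_*,$$

where $\mathbf K$ is the $n\times n$ kernel matrix over the training inputs, $\mathbf k_*$ the vector of covariances between $\mathbf x_*$ and the training inputs, and $\sigma^2$ the observation noise.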
1 code implementation • 24 Feb 2018 • Jacob R. Gardner, Geoff Pleiss, Ruihan Wu, Kilian Q. Weinberger, Andrew Gordon Wilson
Recent work shows that inference for Gaussian processes can be performed efficiently using iterative methods that rely only on matrix-vector multiplications (MVMs).
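The workhorse of such MVM-only inference is conjugate gradients, sketched below for solving $(\mathbf K + \sigma^2 \mathbf I)\mathbf v = \mathbf y$ while touching the kernel matrix only through a matrix-vector product closure. The RBF kernel, noise level, and tolerances are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conjugate_gradients(matvec, y, tol=1e-6, max_iters=1000):
    x = np.zeros_like(y)
    r = y - matvec(x)                    # residual
    p = r.copy()
    rs = r @ r
    for _ in range(max_iters):
        Ap = matvec(p)                   # the only access to the matrix
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# toy usage on a small RBF kernel system
X = rng.normal(size=(200, 3))
K = np.exp(-0.5 * np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))
y = rng.normal(size=200)
v = conjugate_gradients(lambda z: K @ z + 0.1 * z, y)
```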
no code implementations • 19 Nov 2015 • Jacob R. Gardner, Paul Upchurch, Matt J. Kusner, Yixuan Li, Kilian Q. Weinberger, Kavita Bala, John E. Hopcroft
Many tasks in computer vision can be cast as a "label changing" problem, where the goal is to make a semantic change to the appearance of an image or some subject in an image in order to alter the class membership.
no code implementations • 26 Jan 2015 • Zhixiang Xu, Jacob R. Gardner, Stephen Tyree, Kilian Q. Weinberger
For most of the time during which we conducted this research, we were unaware of this prior work.
no code implementations • 16 Jan 2015 • Matt J. Kusner, Jacob R. Gardner, Roman Garnett, Kilian Q. Weinberger
The success of machine learning has led practitioners in diverse real-world settings to learn classifiers for practical problems.
no code implementations • 6 Sep 2014 • Quan Zhou, Wenlin Chen, Shiji Song, Jacob R. Gardner, Kilian Q. Weinberger, Yixin Chen
Our second contribution is to derive a practical algorithm based on this reduction.
no code implementations • 3 Apr 2014 • Stephen Tyree, Jacob R. Gardner, Kilian Q. Weinberger, Kunal Agrawal, John Tran
In particular, we provide the first comparison of algorithms with explicit and implicit parallelization.