no code implementations • 28 Oct 2024 • Joe Benton, Misha Wagner, Eric Christiansen, Cem Anil, Ethan Perez, Jai Srivastav, Esin Durmus, Deep Ganguli, Shauna Kravec, Buck Shlegeris, Jared Kaplan, Holden Karnofsky, Evan Hubinger, Roger Grosse, Samuel R. Bowman, David Duvenaud
We develop a set of related threat models and evaluations.
1 code implementation • 14 Jun 2024 • Carson Denison, Monte MacDiarmid, Fazl Barez, David Duvenaud, Shauna Kravec, Samuel Marks, Nicholas Schiefer, Ryan Soklaski, Alex Tamkin, Jared Kaplan, Buck Shlegeris, Samuel R. Bowman, Ethan Perez, Evan Hubinger
We construct a curriculum of increasingly sophisticated gameable environments and find that training on early-curriculum environments leads to more specification gaming on remaining environments.
1 code implementation • 21 May 2024 • James Requeima, John Bronskill, Dami Choi, Richard E. Turner, David Duvenaud
Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses.
no code implementations • 13 Feb 2024 • Daniel D. Johnson, Daniel Tarlow, David Duvenaud, Chris J. Maddison
Identifying how much a model ${\widehat{p}}_{\theta}(Y|X)$ knows about the stochastic real-world process $p(Y|X)$ it was trained on is important to ensure it avoids producing incorrect or "hallucinated" answers or taking unsafe actions.
1 code implementation • 10 Jan 2024 • Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, Deep Ganguli, Fazl Barez, Jack Clark, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, Yuntao Bai, Zachary Witten, Marina Favaro, Jan Brauner, Holden Karnofsky, Paul Christiano, Samuel R. Bowman, Logan Graham, Jared Kaplan, Sören Mindermann, Ryan Greenblatt, Buck Shlegeris, Nicholas Schiefer, Ethan Perez
We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it).
no code implementations • 9 Nov 2023 • Jack Richter-Powell, Luca Thiede, Alán Aspuru-Guzik, David Duvenaud
Molecular modeling at the quantum level requires choosing a parameterization of the wavefunction that both respects the required particle symmetries and is scalable to systems of many particles.
1 code implementation • 20 Oct 2023 • Mrinank Sharma, Meg Tong, Tomasz Korbak, David Duvenaud, Amanda Askell, Samuel R. Bowman, Newton Cheng, Esin Durmus, Zac Hatfield-Dodds, Scott R. Johnston, Shauna Kravec, Timothy Maxwell, Sam McCandlish, Kamal Ndousse, Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, Ethan Perez
Overall, our results indicate that sycophancy is a general behavior of state-of-the-art AI assistants, likely driven in part by human preference judgments favoring sycophantic responses.
no code implementations • 28 Dec 2022 • Paul Vicol, Jonathan Lorraine, Fabian Pedregosa, David Duvenaud, Roger Grosse
Bilevel problems consist of two nested sub-problems, called the outer and inner problems, respectively.
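The bilevel structure is easiest to see in code. Below is a minimal sketch (the toy objectives and all names are illustrative, not from the paper): the inner problem is unrolled for a few gradient steps with the graph retained, so that the gradient of the outer objective with respect to the hyperparameter can be backpropagated through the entire inner optimization.

```python
import torch

# Toy bilevel problem: the inner problem fits weights w given a hyperparameter
# lam; the outer problem evaluates the resulting w on a validation objective.
def inner_loss(w, lam):
    return ((w - 2.0) ** 2).sum() + lam * (w ** 2).sum()

def outer_loss(w):
    return ((w - 1.0) ** 2).sum()

lam = torch.tensor(0.5, requires_grad=True)
w = torch.zeros(3, requires_grad=True)

# Unroll inner gradient descent, keeping the graph so the outer gradient
# can flow back through the whole inner trajectory.
for _ in range(20):
    g = torch.autograd.grad(inner_loss(w, lam), w, create_graph=True)[0]
    w = w - 0.1 * g

hypergradient = torch.autograd.grad(outer_loss(w), lam)[0]
```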
no code implementations • NeurIPS 2021 • Aniruddh Raghu, Jonathan Lorraine, Simon Kornblith, Matthew McDermott, David Duvenaud
Pre-training (PT) followed by fine-tuning (FT) is an effective method for training neural networks, and has led to significant performance improvements in many domains.
no code implementations • 16 Feb 2021 • Jonathan Lorraine, David Acuna, Paul Vicol, David Duvenaud
We generalize gradient descent with momentum for optimization in differentiable games to have complex-valued momentum.
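A hedged sketch of what such an update can look like (NumPy; the decay factor and step size are illustrative, and the paper's exact formulation may differ in details): the momentum buffer decays by a complex factor, and only the real part of the buffer moves the real-valued parameters.

```python
import numpy as np

def complex_momentum_step(params, grads, buffers,
                          lr=0.1, beta=0.9 * np.exp(1j * np.pi / 8)):
    """One momentum step with a complex-valued decay factor beta.
    The buffers accumulate gradients with complex decay; only the real
    part of each buffer is applied to the (real-valued) parameters."""
    new_buffers = [beta * b + g for b, g in zip(buffers, grads)]
    new_params = [p - lr * np.real(b) for p, b in zip(params, new_buffers)]
    return new_params, new_buffers
```

The complex decay adds a damped rotation to the momentum, the kind of behavior that helps in adversarial (game-like) optimization where pure positive momentum can diverge.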
2 code implementations • 12 Feb 2021 • Winnie Xu, Ricky T. Q. Chen, Xuechen Li, David Duvenaud
We perform scalable approximate inference in continuous-depth Bayesian neural networks.
1 code implementation • 8 Feb 2021 • Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison
We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables.
no code implementations • NeurIPS Workshop ICBINB 2020 • Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig
Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters.
1 code implementation • ICLR 2021 • Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, Geoffrey Hinton
We find that commentaries can improve training speed and/or performance, and provide insights about the dataset and training process.
1 code implementation • ICLR 2021 • Will Grathwohl, Jacob Kelly, Milad Hashemi, Mohammad Norouzi, Kevin Swersky, David Duvenaud
Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
1 code implementation • NeurIPS 2020 • Jacob Kelly, Jesse Bettencourt, Matthew James Johnson, David Duvenaud
Differential equations parameterized by neural networks become expensive to solve numerically as training progresses.
1 code implementation • 9 Jul 2020 • Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba
We introduce a method, Gradient Clustering, to minimize the variance of the average mini-batch gradient with stratified sampling.
no code implementations • ICLR 2020 • Yucen Luo, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ryan P. Adams, Ricky T. Q. Chen
Standard variational lower bounds used to train latent variable models produce biased estimates of most quantities of interest.
no code implementations • 5 Mar 2020 • Sana Tonekaboni, Shalmali Joshi, Kieran Campbell, David Duvenaud, Anna Goldenberg
Explanations of time series models are useful for high-stakes applications like healthcare, but have received little attention in the machine learning literature.
1 code implementation • ICML 2020 • Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel
We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$, defined in terms of a vector function of the data.
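A minimal sketch of the Langevin Stein operator behind this estimator (PyTorch; `f` is the learned vector-valued critic and `score_q` returns $\nabla_x \log q(x)$, both assumed given as callables). The paper trains the critic to maximize the averaged operator value; here the divergence term is computed exactly for clarity, whereas high-dimensional practice calls for a stochastic estimator.

```python
import torch

def stein_operator(f, score_q, x):
    """(T_q f)(x) = f(x) . grad_x log q(x) + div_x f(x), per sample.
    Averaging over samples from the data density p gives the
    Stein discrepancy estimate."""
    x = x.clone().requires_grad_(True)
    fx = f(x)
    dot = (fx * score_q(x)).sum(dim=1)
    div = sum(
        torch.autograd.grad(fx[:, i].sum(), x, create_graph=True)[0][:, i]
        for i in range(x.shape[1])
    )
    return dot + div
```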
4 code implementations • 5 Jan 2020 • Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, David Duvenaud
The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations.
Ranked #1 on Video Prediction on CMU Mocap-2
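For reference, the ODE adjoint that this paper extends to stochastic differential equations: for a state $z(t)$ following $dz/dt = f(z, t, \theta)$ and a loss $L$ depending on $z(t_1)$, the adjoint $a(t) = \partial L / \partial z(t)$ obeys a second ODE that is solved backwards in time alongside the original one:

```latex
\frac{da(t)}{dt} = -\,a(t)^{\top} \frac{\partial f(z(t), t, \theta)}{\partial z},
\qquad
\frac{dL}{d\theta} = -\int_{t_1}^{t_0} a(t)^{\top}
    \frac{\partial f(z(t), t, \theta)}{\partial \theta}\, dt .
```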
no code implementations • 8 Dec 2019 • Ricky T. Q. Chen, David Duvenaud
Gradients of neural networks can be computed efficiently for any architecture, but some applications require differential operators with higher time complexity.
4 code implementations • ICLR 2020 • Will Grathwohl, Kuan-Chieh Wang, Jörn-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, Kevin Swersky
In this setting, the standard class probabilities can be easily computed, as can unnormalized values of $p(x)$ and $p(x|y)$.
9 code implementations • 6 Nov 2019 • Jonathan Lorraine, Paul Vicol, David Duvenaud
We propose an algorithm for inexpensive gradient-based hyperparameter optimization that combines the implicit function theorem (IFT) with efficient inverse Hessian approximations.
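The expensive piece of the IFT hypergradient is an inverse-Hessian-vector product; the Neumann-series trick reduces it to a short sequence of Hessian-vector products. A minimal sketch under simplifying assumptions (a single weight tensor `w`; step size and truncation depth are illustrative):

```python
import torch

def neumann_inverse_hvp(train_loss, w, v, alpha=0.01, steps=20):
    """Approximate H^{-1} v, where H is the Hessian of train_loss at w,
    via the truncated Neumann series H^{-1} ~ alpha * sum_k (I - alpha*H)^k."""
    grad_w = torch.autograd.grad(train_loss, w, create_graph=True)[0]
    p, acc = v.clone(), v.clone()
    for _ in range(steps):
        hvp = torch.autograd.grad(grad_w, w, grad_outputs=p,
                                  retain_graph=True)[0]
        p = p - alpha * hvp        # p <- (I - alpha * H) p
        acc = acc + p
    return alpha * acc
```

Contracting the result with the mixed second derivative of the training loss with respect to hyperparameters and weights yields the implicit part of the hypergradient.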
2 code implementations • NeurIPS 2019 • Renjie Liao, Yujia Li, Yang Song, Shenlong Wang, Charlie Nash, William L. Hamilton, David Duvenaud, Raquel Urtasun, Richard S. Zemel
Our model generates graphs one block of nodes and associated edges at a time.
no code implementations • 25 Sep 2019 • Sana Tonekaboni, Shalmali Joshi, David Duvenaud, Anna Goldenberg
We propose a method to automatically compute the importance of features at every observation in time series, by simulating counterfactual trajectories given previous observations.
no code implementations • ACL 2019 • Kawin Ethayarajh, David Duvenaud, Graeme Hirst
Word embeddings are often criticized for capturing undesirable word associations such as gender stereotypes.
11 code implementations • 8 Jul 2019 • Yulia Rubanova, Ricky T. Q. Chen, David Duvenaud
Time series with non-uniform intervals occur in many applications, and are difficult to model using standard recurrent neural networks (RNNs).
Ranked #1 on Multivariate Time Series Imputation on MuJoCo
Tasks: Multivariate Time Series Forecasting, Multivariate Time Series Imputation, +3 more
4 code implementations • NeurIPS 2019 • Ricky T. Q. Chen, Jens Behrmann, David Duvenaud, Jörn-Henrik Jacobsen
Flow-based generative models parameterize probability distributions through an invertible transformation and can be trained by maximum likelihood.
Ranked #2 on Image Generation on MNIST
3 code implementations • ICLR 2019 • Matthew MacKay, Paul Vicol, Jon Lorraine, David Duvenaud, Roger Grosse
Empirically, our approach outperforms competing hyperparameter optimization methods on large-scale deep learning problems.
5 code implementations • 2 Nov 2018 • Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Jörn-Henrik Jacobsen
We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation.
Ranked #5 on Image Generation on MNIST
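The inverse of each residual block $y = x + g(x)$ has no closed form, but because $g$ is constrained (via spectral normalization) to be a contraction, a simple fixed-point iteration recovers it. A minimal sketch (iteration count illustrative):

```python
import torch

def invert_residual_block(g, y, n_iters=50):
    """Invert y = x + g(x) by the fixed-point iteration x <- y - g(x),
    which converges whenever g has Lipschitz constant < 1."""
    x = y.clone()
    for _ in range(n_iters):
        x = y - g(x)
    return x
```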
no code implementations • 20 Oct 2018 • Elias Tragas, Calvin Luo, Maxime Gazeau, Kevin Luk, David Duvenaud
A popular matrix completion algorithm is matrix factorization, where ratings are predicted by combining learned user and item parameter vectors.
no code implementations • ACL 2019 • Kawin Ethayarajh, David Duvenaud, Graeme Hirst
A surprising property of word vectors is that word analogies can often be solved with vector arithmetic.
7 code implementations • ICLR 2019 • Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, David Duvenaud
The result is a continuous-time invertible generative model with unbiased density estimation and one-pass sampling, while allowing unrestricted neural network architectures.
Ranked #1 on Density Estimation on UCI MINIBOONE
no code implementations • 20 Aug 2018 • George A. Adam, Petr Smirnov, David Duvenaud, Benjamin Haibe-Kains, Anna Goldenberg
Many deep learning algorithms can be easily fooled with simple adversarial examples.
1 code implementation • ICLR 2019 • Chun-Hao Chang, Elliot Creager, Anna Goldenberg, David Duvenaud
When an image classifier makes a prediction, which parts of the image are relevant? We can rephrase this question to ask: which parts of the image, if they were not seen by the classifier, would most change its decision?
no code implementations • 5 Jul 2018 • Elias Tragas, Calvin Luo, Maxime Gazeau, Kevin Luk, David Duvenaud
Recommender systems can be formulated as a matrix completion problem, predicting ratings from user and item parameter vectors.
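For context, the factorization baseline fits in a few lines; a minimal sketch using gradient steps on the squared error over observed entries (all names and hyperparameters illustrative):

```python
import numpy as np

def factorize(R, mask, rank=8, lr=0.01, reg=0.1, epochs=200, seed=0):
    """Approximate observed ratings R[i, j] (where mask[i, j] == 1)
    by dot products of learned user vectors U[i] and item vectors V[j]."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((R.shape[0], rank))
    V = 0.1 * rng.standard_normal((R.shape[1], rank))
    for _ in range(epochs):
        E = mask * (U @ V.T - R)        # errors on observed entries only
        U -= lr * (E @ V + reg * U)     # gradient of squared error + L2
        V -= lr * (E.T @ U + reg * V)
    return U, V
```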
56 code implementations • NeurIPS 2018 • Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud
Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network.
Ranked #2 on Multivariate Time Series Imputation on MuJoCo
Tasks: Multivariate Time Series Forecasting, Multivariate Time Series Imputation
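The reference implementation released with the paper, torchdiffeq, makes the idea concrete: the dynamics are a small network, and a black-box solver produces hidden states at any requested times. A minimal usage sketch (layer sizes illustrative):

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # released alongside the paper

class ODEFunc(nn.Module):
    """Parameterizes the derivative dh/dt = f(h, t) with a neural network."""
    def __init__(self, dim=2, hidden=50):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, h):
        return self.net(h)

func = ODEFunc()
h0 = torch.randn(16, 2)            # batch of initial hidden states
t = torch.linspace(0.0, 1.0, 10)   # arbitrary evaluation times
h_t = odeint(func, h0, t)          # shape (10, 16, 2); differentiable
```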
1 code implementation • ICLR 2018 • Jonathan Lorraine, David Duvenaud
Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters.
10 code implementations • NeurIPS 2018 • Ricky T. Q. Chen, Xuechen Li, Roger Grosse, David Duvenaud
We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables.
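Concretely, the decomposition in question (writing $q(z)$ for the aggregate posterior) splits the averaged KL term of the evidence lower bound into three interpretable pieces, the middle one being the total correlation:

```latex
\mathbb{E}_{p(x)}\!\left[\mathrm{KL}\big(q(z|x)\,\|\,p(z)\big)\right]
= \underbrace{I_q(x; z)}_{\text{index-code MI}}
+ \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\textstyle\prod_j q(z_j)\Big)}_{\text{total correlation}}
+ \underbrace{\textstyle\sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)}_{\text{dimension-wise KL}}
```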
2 code implementations • ICML 2018 • Chris Cremer, Xuechen Li, David Duvenaud
Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply improving the complexity of the approximation.
2 code implementations • 17 Dec 2017 • Nathan Killoran, Leo J. Lee, Andrew Delong, David Duvenaud, Brendan J. Frey
We propose generative neural network methods to generate DNA sequences and tune them to have desired properties.
2 code implementations • ICML 2018 • Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse
Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation.
7 code implementations • ICLR 2018 • Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud
Gradient-based optimization is the foundation of deep learning and reinforcement learning.
no code implementations • 10 Apr 2017 • Chris Cremer, Quaid Morris, David Duvenaud
The standard interpretation of importance-weighted autoencoders is that they maximize a tighter lower bound on the marginal likelihood than the standard evidence lower bound.
1 code implementation • NeurIPS 2017 • Geoffrey Roeder, Yuhuai Wu, David Duvenaud
We propose a simple and general variant of the standard reparameterized gradient estimator for the variational evidence lower bound.
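The variant amounts to a one-line change: when evaluating $\log q(z|x)$ inside the ELBO, detach the variational parameters so that only the path derivative survives, removing a zero-mean but high-variance score-function term. A sketch for a diagonal-Gaussian posterior (`encoder`, `decoder_log_lik`, and `log_prior` are placeholder callables):

```python
import math
import torch

def elbo_path_derivative(x, encoder, decoder_log_lik, log_prior):
    """One-sample reparameterized ELBO whose gradient w.r.t. the variational
    parameters keeps only the path derivative: the parameters are detached
    inside log q(z|x)."""
    mu, log_sigma = encoder(x)
    z = mu + log_sigma.exp() * torch.randn_like(mu)  # reparameterized sample
    mu_, ls_ = mu.detach(), log_sigma.detach()       # "stick the landing"
    log_q = (-0.5 * ((z - mu_) / ls_.exp()) ** 2 - ls_
             - 0.5 * math.log(2 * math.pi)).sum(-1)
    return decoder_log_lik(x, z) + log_prior(z) - log_q
```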
11 code implementations • 7 Oct 2016 • Rafael Gómez-Bombarelli, Jennifer N. Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams, Alán Aspuru-Guzik
We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation.
no code implementations • 22 Aug 2016 • Jennifer N. Wei, David Duvenaud, Alán Aspuru-Guzik
Reaction prediction remains one of the major challenges for organic chemistry and is a prerequisite for efficient synthetic planning.
3 code implementations • NeurIPS 2016 • Matthew J. Johnson, David Duvenaud, Alexander B. Wiltschko, Sandeep R. Datta, Ryan P. Adams
We propose a general modeling and inference framework that composes probabilistic graphical models with deep learning methods and combines their respective strengths.
8 code implementations • NeurIPS 2015 • David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, Ryan P. Adams
We introduce a convolutional neural network that operates directly on graphs.
Ranked #2 on Drug Discovery on HIV dataset
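A heavily simplified sketch of the idea (NumPy; the real model also uses bond features and degree-specific weights): each layer pools neighboring atom features through the adjacency matrix, and every atom writes a softmax projection into a fixed-length fingerprint, replacing the hard hashing of circular fingerprints with differentiable operations:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def neural_fingerprint(A, H, layer_ws, out_ws):
    """A: (n_atoms, n_atoms) adjacency; H: (n_atoms, d) atom features;
    layer_ws / out_ws: per-layer hidden and output weight matrices."""
    fp = np.zeros(out_ws[0].shape[1])
    for W, O in zip(layer_ws, out_ws):
        H = np.tanh((A + np.eye(A.shape[0])) @ H @ W)  # pool self + neighbors
        fp += softmax(H @ O).sum(axis=0)               # every atom writes in
    return fp
```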
1 code implementation • 6 Apr 2015 • Dougal Maclaurin, David Duvenaud, Ryan P. Adams
Interpreting each step of stochastic optimization as a transformation of an initial distribution over parameters, we track the change in entropy over this sequence of transformations to form a scalable, unbiased estimate of the variational lower bound on the log marginal likelihood.
2 code implementations • 11 Feb 2015 • Dougal Maclaurin, David Duvenaud, Ryan P. Adams
Tuning hyperparameters of learning algorithms is hard because gradients are usually unavailable.
no code implementations • 14 Sep 2014 • Kevin Swersky, David Duvenaud, Jasper Snoek, Frank Hutter, Michael A. Osborne
In practical Bayesian optimization, we must often search over structures with differing numbers of parameters.
1 code implementation • 9 Aug 2014 • Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani
A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.
no code implementations • 9 Aug 2014 • Ferenc Huszár, David Duvenaud
We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature.
no code implementations • NeurIPS 2014 • Michael Schober, David Duvenaud, Philipp Hennig
We construct a family of probabilistic numerical methods that, instead of a single point estimate, return a Gauss-Markov process defining a probability distribution over the ODE solution.
2 code implementations • 24 Feb 2014 • David Duvenaud, Oren Rippel, Ryan P. Adams, Zoubin Ghahramani
Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance.
3 code implementations • 18 Feb 2014 • James Robert Lloyd, David Duvenaud, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani
This paper presents the beginnings of an automatic statistician, focusing on regression problems.
5 code implementations • 20 Feb 2013 • David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani
Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art.
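The search space is a grammar over base kernels combined with + and ×, so the resulting structures are easy to express in any modern GP library. A sketch in scikit-learn (which the paper predates; the particular composite below is illustrative):

```python
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, DotProduct

# One candidate structure from the grammar: a smooth trend plus a
# periodic component whose amplitude grows linearly with the input.
kernel = RBF(length_scale=10.0) + DotProduct() * ExpSineSquared(periodicity=1.0)
gp = GaussianProcessRegressor(kernel=kernel)
# The search greedily expands the current best structure with each base
# kernel via + and *, scoring candidates by approximate marginal likelihood.
```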
1 code implementation • 8 Jun 2012 • Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani
A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.
1 code implementation • 7 Apr 2012 • Ferenc Huszár, David Duvenaud
We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature.
1 code implementation • NeurIPS 2011 • David Duvenaud, Hannes Nickisch, Carl Edward Rasmussen
We introduce a Gaussian process model of functions which are additive.