Search Results for author: David Duvenaud

Found 56 papers, 37 papers with code

Tools for Verifying Neural Models' Training Data

no code implementations 2 Jul 2023 Dami Choi, Yonadav Shavit, David Duvenaud

It is important that consumers and regulators can verify the provenance of large neural models to evaluate their capabilities and risks.

Meta-Learning to Improve Pre-Training

no code implementations NeurIPS 2021 Aniruddh Raghu, Jonathan Lorraine, Simon Kornblith, Matthew McDermott, David Duvenaud

Pre-training (PT) followed by fine-tuning (FT) is an effective method for training neural networks, and has led to significant performance improvements in many domains.

Data Augmentation Hyperparameter Optimization +1

Complex Momentum for Optimization in Games

no code implementations16 Feb 2021 Jonathan Lorraine, David Acuna, Paul Vicol, David Duvenaud

We generalize gradient descent with momentum for optimization in differentiable games to have complex-valued momentum.
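In code, the generalization amounts to letting the momentum buffer be complex while applying only its real part to the parameters. A minimal sketch; the magnitude and phase of beta below are illustrative choices, not the paper's tuned values:

```python
import numpy as np

def complex_momentum_step(x, m, grad, lr=0.1, beta=0.9 * np.exp(1j * np.pi / 8)):
    """One momentum step with a complex-valued coefficient beta.

    The buffer m accumulates gradients as a complex number; only its
    real part is applied to the (real-valued) parameters x.
    """
    m = beta * m + grad
    x = x - lr * np.real(m)
    return x, m
```

On an ordinary quadratic this behaves like damped heavy-ball momentum; the complex phase is what the paper exploits in adversarial (game) settings.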

Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

1 code implementation 8 Feb 2021 Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables.
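A much-simplified sketch of the gradient-informed flavor of this idea, for binary variables with a relaxed log-probability: score each possible bit flip by a first-order Taylor estimate of the change in log-probability, propose a flip proportionally, and correct with a Metropolis-Hastings step. The independent-Bernoulli target and all names are illustrative, not the paper's method verbatim:

```python
import numpy as np

def gwg_step(x, log_p, grad_log_p, rng):
    """One gradient-informed flip proposal with MH correction for binary x."""
    g = grad_log_p(x)
    delta = -(2 * x - 1) * g               # first-order estimate of log-prob change per flip
    q = np.exp(delta / 2); q /= q.sum()
    i = rng.choice(len(x), p=q)
    x_new = x.copy(); x_new[i] = 1 - x_new[i]
    # reverse-proposal probability for the MH correction
    delta2 = -(2 * x_new - 1) * grad_log_p(x_new)
    q2 = np.exp(delta2 / 2); q2 /= q2.sum()
    accept = np.exp(log_p(x_new) - log_p(x)) * q2[i] / q[i]
    return x_new if rng.random() < min(1.0, accept) else x
```

Because of the MH correction the chain targets p exactly; the gradient only shapes the proposal.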

Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering

no code implementations NeurIPS Workshop ICBINB 2020 Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig

Standard first-order stochastic optimization algorithms base their updates solely on the average mini-batch gradient, and it has been shown that tracking additional quantities such as the curvature can help de-sensitize common hyperparameters.

Stochastic Optimization

Teaching with Commentaries

1 code implementation ICLR 2021 Aniruddh Raghu, Maithra Raghu, Simon Kornblith, David Duvenaud, Geoffrey Hinton

We find that commentaries can improve training speed and/or performance, and provide insights about the dataset and training process.

Data Augmentation

A Study of Gradient Variance in Deep Learning

1 code implementation 9 Jul 2020 Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba

We introduce a method, Gradient Clustering, to minimize the variance of average mini-batch gradient with stratified sampling.
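The stratified-sampling idea can be illustrated with scalars standing in for per-example gradients; the cluster labels are assumed given here (in the paper they come from clustering the gradients themselves), and all names are illustrative:

```python
import numpy as np

def stratified_mean(values, labels, per_stratum, rng):
    """Estimate mean(values) by sampling `per_stratum` points inside each
    cluster and weighting each cluster by its size. When clusters group
    similar values, this has lower variance than uniform subsampling."""
    total = 0.0
    for c in np.unique(labels):
        group = values[labels == c]
        total += len(group) * rng.choice(group, size=per_stratum).mean()
    return total / len(values)
```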


SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models

no code implementations ICLR 2020 Yucen Luo, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ryan P. Adams, Ricky T. Q. Chen

Standard variational lower bounds used to train latent variable models produce biased estimates of most quantities of interest.

What went wrong and when? Instance-wise Feature Importance for Time-series Models

no code implementations5 Mar 2020 Sana Tonekaboni, Shalmali Joshi, Kieran Campbell, David Duvenaud, Anna Goldenberg

Explanations of time series models are useful for high stakes applications like healthcare but have received little attention in machine learning literature.

Feature Importance Time Series +1

Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

1 code implementation ICML 2020 Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel

We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data.

Neural Networks with Cheap Differential Operators

no code implementations 8 Dec 2019 Ricky T. Q. Chen, David Duvenaud

Gradients of neural networks can be computed efficiently for any architecture, but some applications require differential operators with higher time complexity.

Optimizing Millions of Hyperparameters by Implicit Differentiation

9 code implementations 6 Nov 2019 Jonathan Lorraine, Paul Vicol, David Duvenaud

We propose an algorithm for inexpensive gradient-based hyperparameter optimization that combines the implicit function theorem (IFT) with efficient inverse Hessian approximations.

Data Augmentation Hyperparameter Optimization
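A hedged sketch of the inverse-Hessian piece: the truncated Neumann series needs only Hessian-vector products, never the Hessian itself. The step size and iteration count below are illustrative choices:

```python
import numpy as np

def neumann_inverse_hvp(hvp, v, alpha=0.1, iters=200):
    """Approximate H^{-1} v using only Hessian-vector products, via the
    truncated Neumann series  alpha * sum_k (I - alpha*H)^k v,
    which converges when the eigenvalues of alpha*H lie in (0, 2)."""
    p = v.copy()
    acc = v.copy()
    for _ in range(iters):
        p = p - alpha * hvp(p)
        acc = acc + p
    return alpha * acc
```

The IFT hypergradient then chains this product with the mixed second derivative of the training loss with respect to hyperparameters and weights.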

Explaining Time Series by Counterfactuals

no code implementations25 Sep 2019 Sana Tonekaboni, Shalmali Joshi, David Duvenaud, Anna Goldenberg

We propose a method to automatically compute the importance of features at every observation in time series, by simulating counterfactual trajectories given previous observations.

Feature Importance Time Series +1

Understanding Undesirable Word Embedding Associations

no code implementations ACL 2019 Kawin Ethayarajh, David Duvenaud, Graeme Hirst

Word embeddings are often criticized for capturing undesirable word associations such as gender stereotypes.

Word Embeddings

Latent ODEs for Irregularly-Sampled Time Series

10 code implementations 8 Jul 2019 Yulia Rubanova, Ricky T. Q. Chen, David Duvenaud

Time series with non-uniform intervals occur in many applications, and are difficult to model using standard recurrent neural networks (RNNs).

Multivariate Time Series Forecasting Multivariate Time Series Imputation +3

Residual Flows for Invertible Generative Modeling

4 code implementations NeurIPS 2019 Ricky T. Q. Chen, Jens Behrmann, David Duvenaud, Jörn-Henrik Jacobsen

Flow-based generative models parameterize probability distributions through an invertible transformation and can be trained by maximum likelihood.

Density Estimation Image Generation

Invertible Residual Networks

4 code implementations 2 Nov 2018 Jens Behrmann, Will Grathwohl, Ricky T. Q. Chen, David Duvenaud, Jörn-Henrik Jacobsen

We show that standard ResNet architectures can be made invertible, allowing the same model to be used for classification, density estimation, and generation.

Density Estimation General Classification +1
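One standard way to meet the invertibility condition is to keep each residual branch's Lipschitz constant below 1 by constraining layer spectral norms; a minimal sketch using power iteration (the coefficient 0.9 and iteration count are illustrative):

```python
import numpy as np

def spectral_normalize(W, coeff=0.9, n_iter=100, seed=0):
    """Rescale W so its largest singular value is at most coeff < 1.
    A residual block x + f(x) with Lip(f) < 1 is invertible by fixed-point
    iteration; bounding each layer's spectral norm is one way to get there.
    The spectral norm is estimated by power iteration."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v                      # Rayleigh-style spectral-norm estimate
    return W * min(1.0, coeff / sigma)
```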

Scalable Recommender Systems through Recursive Evidence Chains

no code implementations 20 Oct 2018 Elias Tragas, Calvin Luo, Maxime Gazeau, Kevin Luk, David Duvenaud

A popular matrix completion algorithm is matrix factorization, where ratings are predicted from combining learned user and item parameter vectors.

Matrix Completion Recommendation Systems
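A minimal sketch of the matrix-factorization baseline described above, fitting only the observed entries by gradient descent (learning rate, rank, and step count are illustrative):

```python
import numpy as np

def factorize(R, mask, k=2, lr=0.01, steps=5000, seed=0):
    """Fit user vectors U and item vectors V so that U @ V.T matches the
    observed entries of R (mask == 1); unobserved entries are then
    predicted by the learned inner products."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((R.shape[0], k))
    V = 0.1 * rng.standard_normal((R.shape[1], k))
    for _ in range(steps):
        E = mask * (U @ V.T - R)           # error on observed entries only
        U, V = U - lr * E @ V, V - lr * E.T @ U
    return U @ V.T
```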

Towards Understanding Linear Word Analogies

no code implementations ACL 2019 Kawin Ethayarajh, David Duvenaud, Graeme Hirst

A surprising property of word vectors is that word analogies can often be solved with vector arithmetic.
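The arithmetic itself is simple; a toy sketch with made-up embeddings (all vectors below are illustrative, not from any trained model):

```python
import numpy as np

def analogy(a, b, c, emb):
    """Solve a : b :: c : ? by vector arithmetic: the answer is the word
    whose vector is closest (by cosine) to v_b - v_a + v_c."""
    target = emb[b] - emb[a] + emb[c]
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(target, candidates[w]))
```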

FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models

7 code implementations ICLR 2019 Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, David Duvenaud

The result is a continuous-time invertible generative model with unbiased density estimation and one-pass sampling, while allowing unrestricted neural network architectures.

Ranked #1 on Density Estimation on CIFAR-10 (NLL metric)

Density Estimation Image Generation +1
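The unbiased density estimation rests on a stochastic trace estimator for the Jacobian of the dynamics; a sketch of the Hutchinson-style estimator (sample count is illustrative):

```python
import numpy as np

def hutchinson_trace(jvp, dim, rng, n_samples=2000):
    """Unbiased stochastic trace estimate: tr(J) = E[v^T J v] for Rademacher v.
    FFJORD uses this kind of estimator so the instantaneous change of
    log-density (a Jacobian trace) never has to be computed exactly."""
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)
        total += v @ jvp(v)
    return total / n_samples
```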

Explaining Image Classifiers by Counterfactual Generation

1 code implementation ICLR 2019 Chun-Hao Chang, Elliot Creager, Anna Goldenberg, David Duvenaud

We can rephrase this question to ask: which parts of the image, if they were not seen by the classifier, would most change its decision?

Image Classification

Scalable Recommender Systems through Recursive Evidence Chains

no code implementations 5 Jul 2018 Elias Tragas, Calvin Luo, Maxime Gazeau, Kevin Luk, David Duvenaud

Recommender systems can be formulated as a matrix completion problem, predicting ratings from user and item parameter vectors.

Matrix Completion Recommendation Systems

Stochastic Hyperparameter Optimization through Hypernetworks

1 code implementation ICLR 2018 Jonathan Lorraine, David Duvenaud

Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters.

BIG-bench Machine Learning Hyperparameter Optimization +1

Isolating Sources of Disentanglement in Variational Autoencoders

10 code implementations NeurIPS 2018 Ricky T. Q. Chen, Xuechen Li, Roger Grosse, David Duvenaud

We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables.
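The decomposition referred to, writing $q(z)$ for the aggregate posterior, splits the average KL term of the ELBO into index-code mutual information, total correlation, and dimension-wise KL:

```latex
\mathbb{E}_{p(x)}\!\left[\mathrm{KL}\big(q(z\mid x)\,\|\,p(z)\big)\right]
  = \underbrace{I_q(x;z)}_{\text{index-code MI}}
  + \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\prod\nolimits_j q(z_j)\Big)}_{\text{total correlation}}
  + \underbrace{\sum\nolimits_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)}_{\text{dimension-wise KL}}
```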


Inference Suboptimality in Variational Autoencoders

2 code implementations ICML 2018 Chris Cremer, Xuechen Li, David Duvenaud

Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply improving the complexity of the approximation.

Generating and designing DNA with deep generative models

2 code implementations 17 Dec 2017 Nathan Killoran, Leo J. Lee, Andrew Delong, David Duvenaud, Brendan J. Frey

We propose generative neural network methods to generate DNA sequences and tune them to have desired properties.

Noisy Natural Gradient as Variational Inference

2 code implementations ICML 2018 Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse

Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation.

Active Learning Efficient Exploration +2

Reinterpreting Importance-Weighted Autoencoders

no code implementations10 Apr 2017 Chris Cremer, Quaid Morris, David Duvenaud

The standard interpretation of importance-weighted autoencoders is that they maximize a tighter lower bound on the marginal likelihood than the standard evidence lower bound.
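Numerically the importance-weighted bound is a stabilized logsumexp of the K log-weights; a minimal sketch:

```python
import numpy as np

def iwae_bound(log_w):
    """Importance-weighted bound  log(1/K * sum_k w_k), computed stably from
    log-weights log w_k = log p(x, z_k) - log q(z_k | x). With K = 1 this is
    exactly the standard ELBO; larger K tightens the bound in expectation."""
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))
```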

Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference

1 code implementation NeurIPS 2017 Geoffrey Roeder, Yuhuai Wu, David Duvenaud

We propose a simple and general variant of the standard reparameterized gradient estimator for the variational evidence lower bound.

Variational Inference
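For a Gaussian q the construction can be sketched directly: differentiate the reparameterized objective only through the sample path, dropping the zero-mean score term that comes from the variational parameters inside log q. At the optimum q = p the resulting estimator is exactly zero for every draw, which is the variance reduction. Function names below are illustrative:

```python
import numpy as np

def stl_grads(mu, log_sigma, grad_log_p, eps):
    """Path-only (sticking-the-landing style) reparameterization gradient of
    E[log p(z) - log q(z)] for q = N(mu, sigma^2), with z = mu + sigma * eps.
    The direct dependence of log q on (mu, log_sigma) is treated as constant."""
    sigma = np.exp(log_sigma)
    z = mu + sigma * eps
    diff = grad_log_p(z) - (-(z - mu) / sigma**2)  # d/dz [log p(z) - log q(z)]
    return diff * 1.0, diff * sigma * eps          # chain rule through z only
```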

Neural networks for the prediction of organic chemistry reactions

no code implementations 22 Aug 2016 Jennifer N. Wei, David Duvenaud, Alán Aspuru-Guzik

Reaction prediction remains one of the major challenges for organic chemistry, and is a pre-requisite for efficient synthetic planning.

Composing graphical models with neural networks for structured representations and fast inference

3 code implementations NeurIPS 2016 Matthew J. Johnson, David Duvenaud, Alexander B. Wiltschko, Sandeep R. Datta, Ryan P. Adams

We propose a general modeling and inference framework that composes probabilistic graphical models with deep learning methods and combines their respective strengths.

Variational Inference

Early Stopping is Nonparametric Variational Inference

1 code implementation 6 Apr 2015 Dougal Maclaurin, David Duvenaud, Ryan P. Adams

By tracking the change in entropy over this sequence of transformations during optimization, we form a scalable, unbiased estimate of the variational lower bound on the log marginal likelihood.

Variational Inference

Gradient-based Hyperparameter Optimization through Reversible Learning

2 code implementations 11 Feb 2015 Dougal Maclaurin, David Duvenaud, Ryan P. Adams

Tuning hyperparameters of learning algorithms is hard because gradients are usually unavailable.

Hyperparameter Optimization

Warped Mixtures for Nonparametric Cluster Shapes

1 code implementation 9 Aug 2014 Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani

A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.

Density Estimation

Optimally-Weighted Herding is Bayesian Quadrature

no code implementations 9 Aug 2014 Ferenc Huszar, David Duvenaud

We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature.
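A sketch of the greedy herding criterion in question; the grid of candidates, the RBF kernel, and the empirical kernel mean are illustrative stand-ins:

```python
import numpy as np

def herd(candidates, mu_p, k, n=10):
    """Greedy kernel herding: each new point maximizes the kernel mean map
    mu_p(x) minus the average kernel similarity to the points already
    chosen, steering the sample set toward matching the target's mean
    embedding (the quantity Bayesian quadrature's posterior variance tracks)."""
    chosen = []
    for t in range(n):
        best = max(candidates,
                   key=lambda x: mu_p(x) - sum(k(x, s) for s in chosen) / (t + 1))
        chosen.append(best)
    return chosen
```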

Probabilistic ODE Solvers with Runge-Kutta Means

no code implementations NeurIPS 2014 Michael Schober, David Duvenaud, Philipp Hennig

We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution.

Avoiding pathologies in very deep networks

2 code implementations 24 Feb 2014 David Duvenaud, Oren Rippel, Ryan P. Adams, Zoubin Ghahramani

Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance.

Gaussian Processes

Warped Mixtures for Nonparametric Cluster Shapes

1 code implementation 8 Jun 2012 Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani

A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.

Density Estimation

Optimally-Weighted Herding is Bayesian Quadrature

1 code implementation 7 Apr 2012 Ferenc Huszár, David Duvenaud

We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature.
