Search Results for author: David Krueger

Found 31 papers, 15 papers with code

Investigating the Nature of 3D Generalization in Deep Neural Networks

1 code implementation · 19 Apr 2023 · Shoaib Ahmed Siddiqui, David Krueger, Thomas Breuel

Modern deep learning architectures for object recognition generalize well to novel views, but the mechanisms are not well understood.

Object Recognition

Unifying Grokking and Double Descent

1 code implementation · 10 Mar 2023 · Xander Davies, Lauro Langosco, David Krueger

A principled understanding of generalization in deep learning may require unifying disparate observations under a single conceptual framework.

On The Fragility of Learned Reward Functions

no code implementations · 9 Jan 2023 · Lev McKinney, Yawen Duan, David Krueger, Adam Gleave

Our work focuses on demonstrating and studying the causes of these relearning failures in the domain of preference-based reward learning.

Continuous Control

Domain Generalization for Robust Model-Based Offline Reinforcement Learning

no code implementations · 27 Nov 2022 · Alan Clark, Shoaib Ahmed Siddiqui, Robert Kirk, Usman Anwar, Stephen Chung, David Krueger

Existing offline reinforcement learning (RL) algorithms typically assume that training data is either: 1) generated by a known policy, or 2) of entirely unknown origin.

Domain Generalization · Offline RL +2

Mechanistic Mode Connectivity

1 code implementation · 15 Nov 2022 · Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David Krueger, Hidenori Tanaka

We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved via training on a dataset are connected via simple paths of low loss.

Broken Neural Scaling Laws

1 code implementation · 26 Oct 2022 · Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger

Moreover, this functional form accurately models and extrapolates scaling behavior that other functional forms cannot express, such as the non-monotonic transitions present in the scaling behavior of phenomena such as double descent, and the delayed, sharp inflection points present in the scaling behavior of tasks such as arithmetic.

Adversarial Robustness · Continual Learning +7

Towards Out-of-Distribution Adversarial Robustness

no code implementations · 6 Oct 2022 · Adam Ibrahim, Charles Guille-Escuret, Ioannis Mitliagkas, Irina Rish, David Krueger, Pouya Bashivan

Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training.

Adversarial Robustness

Defining and Characterizing Reward Hacking

no code implementations · 27 Sep 2022 · Joar Skalse, Nikolaus H. R. Howe, Dmitrii Krasheninnikov, David Krueger

We provide the first formal definition of reward hacking, a phenomenon where optimizing an imperfect proxy reward function, $\mathcal{\tilde{R}}$, leads to poor performance according to the true reward function, $\mathcal{R}$.
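
As a minimal, hypothetical illustration of this definition (the action names and reward values below are invented for the example, not taken from the paper): an optimizer that maximizes an imperfect proxy $\mathcal{\tilde{R}}$ can select exactly the behaviour that the true reward $\mathcal{R}$ ranks poorly.

```python
# Hypothetical toy illustration (action names and numbers are invented):
# optimizing an imperfect proxy reward selects a behaviour that the
# true reward function ranks poorly.
true_reward  = {"help_user": 1.0, "game_metric": 0.1}
proxy_reward = {"help_user": 0.8, "game_metric": 0.9}  # imperfect proxy

best_for_proxy = max(proxy_reward, key=proxy_reward.get)  # what gets optimized
best_for_true  = max(true_reward,  key=true_reward.get)   # what we wanted

# The proxy optimizer "hacks" the reward: it picks the low-true-value action.
assert best_for_proxy == "game_metric" and best_for_true == "help_user"
assert true_reward[best_for_proxy] < true_reward[best_for_true]
```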

Revealing the Incentive to Cause Distributional Shift

no code implementations29 Sep 2021 David Krueger, Tegan Maharaj, Jan Leike

We use these unit tests to demonstrate that changes to the learning algorithm (e.g. introducing meta-learning) can cause previously hidden incentives to be revealed, resulting in qualitatively different behaviour despite no change in the performance metric.

Goal Misgeneralization in Deep Reinforcement Learning

2 code implementations · 28 May 2021 · Lauro Langosco, Jack Koch, Lee Sharkey, Jacob Pfau, Laurent Orseau, David Krueger

We study goal misgeneralization, a type of out-of-distribution generalization failure in reinforcement learning (RL).

Navigate · Out-of-Distribution Generalization +2

Active Reinforcement Learning: Observing Rewards at a Cost

no code implementations · 13 Nov 2020 · David Krueger, Jan Leike, Owain Evans, John Salvatier

Active reinforcement learning (ARL) is a variant of reinforcement learning in which the agent does not observe the reward unless it chooses to pay a query cost c > 0.

Multi-Armed Bandits · reinforcement-learning +1
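
A minimal sketch of this setting, assuming a two-armed Gaussian bandit and an agent that queries the reward with a fixed probability (the function name, the epsilon-greedy rule, and all parameter values are illustrative inventions, not from the paper):

```python
import random

def run_active_bandit(arm_means, c=0.1, query_prob=0.2, eps=0.1,
                      noise=0.1, steps=1000, seed=0):
    """Toy active-RL bandit: reward accrues every step, but it is only
    *observed* (and usable for learning) when the agent pays the query
    cost c. Value estimates are updated solely on queried steps."""
    rng = random.Random(seed)
    est = [0.0] * len(arm_means)     # per-arm value estimates
    counts = [0] * len(arm_means)    # queried pulls per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < eps:                                  # explore
            arm = rng.randrange(len(arm_means))
        else:                                                   # exploit
            arm = max(range(len(est)), key=lambda a: est[a])
        r = rng.gauss(arm_means[arm], noise)
        total += r                     # earned whether or not observed
        if rng.random() < query_prob:  # choose to observe, paying c
            total -= c
            counts[arm] += 1
            est[arm] += (r - est[arm]) / counts[arm]  # incremental mean
    return total / steps

avg = run_active_bandit([0.0, 1.0])  # agent should learn to favour arm 1
```

Because the reward stream is independent of c, raising the query cost strictly lowers the net return for the same random seed.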

Hidden Incentives for Auto-Induced Distributional Shift

no code implementations · 19 Sep 2020 · David Krueger, Tegan Maharaj, Jan Leike

We introduce the term auto-induced distributional shift (ADS) to describe the phenomenon of an algorithm causing a change in the distribution of its own inputs.

BIG-bench Machine Learning · Meta-Learning +1

AI Research Considerations for Human Existential Safety (ARCHES)

no code implementations · 30 May 2020 · Andrew Critch, David Krueger

Framed in positive terms, this report examines how technical AI research might be steered in a manner that is more attentive to humanity's long-term prospects for survival as a species.

Scalable agent alignment via reward modeling: a research direction

3 code implementations · 19 Nov 2018 · Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg

One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions.

Atari Games · reinforcement-learning +1

Neural Autoregressive Flows

5 code implementations · ICML 2018 · Chin-wei Huang, David Krueger, Alexandre Lacoste, Aaron Courville

Normalizing flows and autoregressive models have been successfully combined to produce state-of-the-art results in density estimation, via Masked Autoregressive Flows (MAF), and to accelerate state-of-the-art WaveNet-based speech synthesis to 20x faster than real-time, via Inverse Autoregressive Flows (IAF).

Density Estimation · Speech Synthesis

Nested LSTMs

1 code implementation · 31 Jan 2018 · Joel Ruben Antony Moniz, David Krueger

We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple levels of memory.

Language Modelling

Deep Prior

no code implementations · 13 Dec 2017 · Alexandre Lacoste, Thomas Boquet, Negar Rostamzadeh, Boris Oreshkin, Wonchang Chung, David Krueger

The recent literature on deep learning offers new tools to learn a rich probability distribution over high dimensional data such as images or sounds.

Regularizing RNNs by Stabilizing Activations

1 code implementation · 26 Nov 2015 · David Krueger, Roland Memisevic

We stabilize the activations of Recurrent Neural Networks (RNNs) by penalizing the squared distance between successive hidden states' norms.

Language Modelling
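
The penalty described above can be sketched as follows, assuming the hidden states of one sequence are stacked into a (T, d) array (the function name and the beta hyperparameter name are illustrative; the paper's exact formulation may differ):

```python
import numpy as np

def norm_stabilizer_penalty(hidden_states, beta=1.0):
    """Penalty on changes in the norms of successive hidden states.

    hidden_states: array of shape (T, d), one hidden vector per time step.
    beta: penalty strength (a hyperparameter).
    """
    norms = np.linalg.norm(hidden_states, axis=1)         # ||h_t|| for each t
    return beta * np.mean((norms[1:] - norms[:-1]) ** 2)  # squared norm differences

# A sequence with constant hidden-state norms incurs no penalty.
assert norm_stabilizer_penalty(np.ones((5, 4))) == 0.0
```

Note that the penalty constrains only the norms, not the directions, of the hidden states, so the RNN dynamics remain otherwise unconstrained.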

NICE: Non-linear Independent Components Estimation

18 code implementations · 30 Oct 2014 · Laurent Dinh, David Krueger, Yoshua Bengio

NICE is based on the idea that a good representation is one in which the data has a distribution that is easy to model.

Ranked #69 on Image Generation on CIFAR-10 (bits/dimension metric)

Image Generation
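
The additive coupling transform at the heart of NICE can be sketched as below: split the input, leave one half unchanged, and shift the other half by a function of the first, which makes the map exactly invertible with unit Jacobian determinant. Here a fixed tanh stands in for the learned coupling network (an illustrative simplification):

```python
import numpy as np

def coupling_forward(x, m):
    """Additive coupling: keep the first half, shift the second half
    by a function m of the first half."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    return np.concatenate([x1, x2 + m(x1)], axis=-1)  # det(Jacobian) == 1

def coupling_inverse(y, m):
    """Exact inverse: subtract the same shift."""
    d = y.shape[-1] // 2
    y1, y2 = y[..., :d], y[..., d:]
    return np.concatenate([y1, y2 - m(y1)], axis=-1)

# A fixed tanh stands in for the learned coupling network.
x = np.array([0.5, -1.0, 2.0, 0.1])
y = coupling_forward(x, np.tanh)
assert np.allclose(coupling_inverse(y, np.tanh), x)  # exactly invertible
```

Because each layer leaves one half of the dimensions untouched, NICE stacks several coupling layers with alternating splits so every dimension gets transformed.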

Zero-bias autoencoders and the benefits of co-adapting features

no code implementations · 13 Feb 2014 · Kishore Konda, Roland Memisevic, David Krueger

We show that negative biases are a natural result of using a hidden layer whose responsibility is to both represent the input data and act as a selection mechanism that ensures sparsity of the representation.
