Search Results for author: Timothy Mann

Found 21 papers, 8 papers with code

Non-Stationary Bandits with Intermediate Observations

no code implementations ICML 2020 Claire Vernade, András György, Timothy Mann

In fact, if the timescale of the change is comparable to the delay, it is impossible to learn about the environment, since the available observations are already obsolete.

Recommendation Systems

MuZero with Self-competition for Rate Control in VP9 Video Compression

no code implementations 14 Feb 2022 Amol Mandhane, Anton Zhernov, Maribeth Rauh, Chenjie Gu, Miaosen Wang, Flora Xue, Wendy Shang, Derek Pang, Rene Claus, Ching-Han Chiang, Cheng Chen, Jingning Han, Angie Chen, Daniel J. Mankowitz, Jackson Broshear, Julian Schrittwieser, Thomas Hubert, Oriol Vinyals, Timothy Mann

Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services.

Decision Making Quantization +1

Data Augmentation Can Improve Robustness

1 code implementation NeurIPS 2021 Sylvestre-Alvise Rebuffi, Sven Gowal, Dan A. Calian, Florian Stimberg, Olivia Wiles, Timothy Mann

Adversarial training suffers from robust overfitting, a phenomenon where the robust test accuracy starts to decrease during training.

Data Augmentation

Improving Robustness using Generated Data

1 code implementation NeurIPS 2021 Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian Stimberg, Dan Andrei Calian, Timothy Mann

Against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$, our models achieve 66.10% and 33.49% robust accuracy on CIFAR-10 and CIFAR-100, respectively (improving upon the state-of-the-art by +8.96% and +3.29%).

Adversarial Robustness

Defending Against Image Corruptions Through Adversarial Augmentations

no code implementations ICLR 2022 Dan A. Calian, Florian Stimberg, Olivia Wiles, Sylvestre-Alvise Rebuffi, Andras Gyorgy, Timothy Mann, Sven Gowal

Modern neural networks excel at image classification, yet they remain vulnerable to common image corruptions such as blur, speckle noise or fog.

Image Classification

Fixing Data Augmentation to Improve Adversarial Robustness

6 code implementations 2 Mar 2021 Sylvestre-Alvise Rebuffi, Sven Gowal, Dan A. Calian, Florian Stimberg, Olivia Wiles, Timothy Mann

In particular, against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$, our model reaches 64.20% robust accuracy without using any external data, beating most prior works that use external data.

Adversarial Robustness Data Augmentation

Self-supervised Adversarial Robustness for the Low-label, High-data Regime

no code implementations ICLR 2021 Sven Gowal, Po-Sen Huang, Aaron van den Oord, Timothy Mann, Pushmeet Kohli

Experiments on CIFAR-10 against $\ell_2$ and $\ell_\infty$ norm-bounded perturbations demonstrate that BYORL achieves near state-of-the-art robustness with as little as 500 labeled examples.

Adversarial Robustness Self-Supervised Learning +1

Balancing Constraints and Rewards with Meta-Gradient D4PG

no code implementations ICLR 2021 Dan A. Calian, Daniel J. Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy Mann

Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints.

Reinforcement Learning (RL)

Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples

4 code implementations 7 Oct 2020 Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy Mann, Pushmeet Kohli

In the setting with additional unlabeled data, we obtain an accuracy under attack of 65.88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6.35% with respect to prior art).

Adversarial Robustness

Non-Stationary Delayed Bandits with Intermediate Observations

no code implementations 3 Jun 2020 Claire Vernade, Andras Gyorgy, Timothy Mann

In fact, if the timescale of the change is comparable to the delay, it is impossible to learn about the environment, since the available observations are already obsolete.

Recommendation Systems

Achieving Robustness in the Wild via Adversarial Mixing with Disentangled Representations

no code implementations CVPR 2020 Sven Gowal, Chongli Qin, Po-Sen Huang, Taylan Cemgil, Krishnamurthy Dvijotham, Timothy Mann, Pushmeet Kohli

Specifically, we leverage the disentangled latent representations computed by a StyleGAN model to generate perturbations of an image that are similar to real-world variations (like adding make-up, or changing the skin-tone of a person) and train models to be invariant to these perturbations.

An Alternative Surrogate Loss for PGD-based Adversarial Testing

4 code implementations 21 Oct 2019 Sven Gowal, Jonathan Uesato, Chongli Qin, Po-Sen Huang, Timothy Mann, Pushmeet Kohli

Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching norm-bounded perturbations that cause the inputs of neural networks to be misclassified.

Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates

no code implementations NeurIPS 2019 Hugo Penedones, Carlos Riquelme, Damien Vincent, Hartmut Maennel, Timothy Mann, Andre Barreto, Sylvain Gelly, Gergely Neu

We consider the core reinforcement-learning problem of on-policy value function approximation from a batch of trajectory data, and focus on various issues of Temporal Difference (TD) learning and Monte Carlo (MC) policy evaluation.

Robust Reinforcement Learning for Continuous Control with Model Misspecification

no code implementations ICLR 2020 Daniel J. Mankowitz, Nir Levine, Rae Jeong, Yuanyuan Shi, Jackie Kay, Abbas Abdolmaleki, Jost Tobias Springenberg, Timothy Mann, Todd Hester, Martin Riedmiller

We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms.

Continuous Control reinforcement-learning +1

A Bayesian Approach to Robust Reinforcement Learning

no code implementations 20 May 2019 Esther Derman, Daniel Mankowitz, Timothy Mann, Shie Mannor

Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior.

reinforcement-learning Reinforcement Learning (RL) +1

On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models

9 code implementations 30 Oct 2018 Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, Pushmeet Kohli

Recent work has shown that it is possible to train deep neural networks that are provably robust to norm-bounded adversarial perturbations.

Temporal Difference Learning with Neural Networks - Study of the Leakage Propagation Problem

no code implementations 9 Jul 2018 Hugo Penedones, Damien Vincent, Hartmut Maennel, Sylvain Gelly, Timothy Mann, Andre Barreto

Temporal-Difference learning (TD) [Sutton, 1988] with function approximation can converge to solutions that are worse than those obtained by Monte-Carlo regression, even in the simple case of on-policy evaluation.

A Dual Approach to Scalable Verification of Deep Networks

2 code implementations 17 Mar 2018 Krishnamurthy Dvijotham, Robert Stanforth, Sven Gowal, Timothy Mann, Pushmeet Kohli

In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs.

Deep Reinforcement Learning in Large Discrete Action Spaces

2 code implementations 24 Dec 2015 Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin

Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems.

Recommendation Systems reinforcement-learning +1

Off-policy evaluation for MDPs with unknown structure

no code implementations 11 Feb 2015 Assaf Hallak, François Schnitzler, Timothy Mann, Shie Mannor

Off-policy learning in dynamic decision problems is essential for providing strong evidence that a new policy is better than the one in use.

Off-policy evaluation
