no code implementations • ICML 2020 • Claire Vernade, András György, Timothy Mann

In fact, if the timescale of the change is comparable to the delay, it is impossible to learn about the environment, since the available observations are already obsolete.

no code implementations • 14 Feb 2022 • Amol Mandhane, Anton Zhernov, Maribeth Rauh, Chenjie Gu, Miaosen Wang, Flora Xue, Wendy Shang, Derek Pang, Rene Claus, Ching-Han Chiang, Cheng Chen, Jingning Han, Angie Chen, Daniel J. Mankowitz, Jackson Broshear, Julian Schrittwieser, Thomas Hubert, Oriol Vinyals, Timothy Mann

Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services.

1 code implementation • NeurIPS 2021 • Sylvestre-Alvise Rebuffi, Sven Gowal, Dan A. Calian, Florian Stimberg, Olivia Wiles, Timothy Mann

Adversarial training suffers from robust overfitting, a phenomenon where the robust test accuracy starts to decrease during training.

1 code implementation • NeurIPS 2021 • Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian Stimberg, Dan Andrei Calian, Timothy Mann

Against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$, our models achieve 66. 10% and 33. 49% robust accuracy on CIFAR-10 and CIFAR-100, respectively (improving upon the state-of-the-art by +8. 96% and +3. 29%).

no code implementations • ICLR 2022 • Dan A. Calian, Florian Stimberg, Olivia Wiles, Sylvestre-Alvise Rebuffi, Andras Gyorgy, Timothy Mann, Sven Gowal

Modern neural networks excel at image classification, yet they remain vulnerable to common image corruptions such as blur, speckle noise or fog.

7 code implementations • 2 Mar 2021 • Sylvestre-Alvise Rebuffi, Sven Gowal, Dan A. Calian, Florian Stimberg, Olivia Wiles, Timothy Mann

In particular, against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$, our model reaches 64. 20% robust accuracy without using any external data, beating most prior works that use external data.

no code implementations • ICLR 2021 • Sven Gowal, Po-Sen Huang, Aaron van den Oord, Timothy Mann, Pushmeet Kohli

Experiments on CIFAR-10 against $\ell_2$ and $\ell_\infty$ norm-bounded perturbations demonstrate that BYORL achieves near state-of-the-art robustness with as little as 500 labeled examples.

no code implementations • 20 Oct 2020 • Daniel J. Mankowitz, Dan A. Calian, Rae Jeong, Cosmin Paduraru, Nicolas Heess, Sumanth Dathathri, Martin Riedmiller, Timothy Mann

Many real-world physical control systems are required to satisfy constraints upon deployment.

no code implementations • ICLR 2021 • Dan A. Calian, Daniel J. Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy Mann

Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints.

4 code implementations • 7 Oct 2020 • Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy Mann, Pushmeet Kohli

In the setting with additional unlabeled data, we obtain an accuracy under attack of 65. 88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6. 35% with respect to prior art).

no code implementations • 3 Jun 2020 • Claire Vernade, Andras Gyorgy, Timothy Mann

In fact, if the timescale of the change is comparable to the delay, it is impossible to learn about the environment, since the available observations are already obsolete.

no code implementations • CVPR 2020 • Sven Gowal, Chongli Qin, Po-Sen Huang, Taylan Cemgil, Krishnamurthy Dvijotham, Timothy Mann, Pushmeet Kohli

Specifically, we leverage the disentangled latent representations computed by a StyleGAN model to generate perturbations of an image that are similar to real-world variations (like adding make-up, or changing the skin-tone of a person) and train models to be invariant to these perturbations.

4 code implementations • 21 Oct 2019 • Sven Gowal, Jonathan Uesato, Chongli Qin, Po-Sen Huang, Timothy Mann, Pushmeet Kohli

Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching norm-bounded perturbations that cause the inputs of neural networks to be misclassified.

no code implementations • NeurIPS 2019 • Hugo Penedones, Carlos Riquelme, Damien Vincent, Hartmut Maennel, Timothy Mann, Andre Barreto, Sylvain Gelly, Gergely Neu

We consider the core reinforcement-learning problem of on-policy value function approximation from a batch of trajectory data, and focus on various issues of Temporal Difference (TD) learning and Monte Carlo (MC) policy evaluation.

no code implementations • ICLR 2020 • Daniel J. Mankowitz, Nir Levine, Rae Jeong, Yuanyuan Shi, Jackie Kay, Abbas Abdolmaleki, Jost Tobias Springenberg, Timothy Mann, Todd Hester, Martin Riedmiller

We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms.

no code implementations • 20 May 2019 • Esther Derman, Daniel Mankowitz, Timothy Mann, Shie Mannor

Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior.

9 code implementations • 30 Oct 2018 • Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, Pushmeet Kohli

Recent work has shown that it is possible to train deep neural networks that are provably robust to norm-bounded adversarial perturbations.

no code implementations • 9 Jul 2018 • Hugo Penedones, Damien Vincent, Hartmut Maennel, Sylvain Gelly, Timothy Mann, Andre Barreto

Temporal-Difference learning (TD) [Sutton, 1988] with function approximation can converge to solutions that are worse than those obtained by Monte-Carlo regression, even in the simple case of on-policy evaluation.

2 code implementations • 17 Mar 2018 • Krishnamurthy, Dvijotham, Robert Stanforth, Sven Gowal, Timothy Mann, Pushmeet Kohli

In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs.

2 code implementations • 24 Dec 2015 • Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin

Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems.

no code implementations • 11 Feb 2015 • Assaf Hallak, François Schnitzler, Timothy Mann, Shie Mannor

Off-policy learning in dynamic decision problems is essential for providing strong evidence that a new policy is better than the one in use.

Cannot find the paper you are looking for? You can
Submit a new open access paper.

Contact us on:
hello@paperswithcode.com
.
Papers With Code is a free resource with all data licensed under CC-BY-SA.