Search Results for author: Jonathan Uesato

Found 23 papers, 9 papers with code

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

no code implementations NA 2021 Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent SIfre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Fact Checking Language Modelling +3

An Empirical Investigation of Learning from Biased Toxicity Labels

no code implementations4 Oct 2021 Neel Nanda, Jonathan Uesato, Sven Gowal

Collecting annotations from human raters often results in a trade-off between the quantity of labels one wishes to gather and the quality of these labels.

Fairness

Make Sure You're Unsure: A Framework for Verifying Probabilistic Specifications

1 code implementation NeurIPS 2021 Leonard Berrada, Sumanth Dathathri, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Jonathan Uesato, Sven Gowal, M. Pawan Kumar

In this direction, we first introduce a general formulation of probabilistic specifications for neural networks, which captures both probabilistic networks (e. g., Bayesian neural networks, MC-Dropout networks) and uncertain inputs (distributions over inputs arising from sensor noise or other perturbations).

Adversarial Robustness OOD Detection

Avoiding Tampering Incentives in Deep RL via Decoupled Approval

no code implementations17 Nov 2020 Jonathan Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg

How can we design agents that pursue a given objective when all feedback mechanisms are influenceable by the agent?

REALab: An Embedded Perspective on Tampering

no code implementations17 Nov 2020 Ramana Kumar, Jonathan Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg

Standard Markov Decision Process (MDP) formulations of RL and simulated environments mirroring the MDP structure assume secure access to feedback (e. g., rewards).

reinforcement-learning

Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming

2 code implementations NeurIPS 2020 Sumanth Dathathri, Krishnamurthy Dvijotham, Alexey Kurakin, aditi raghunathan, Jonathan Uesato, Rudy Bunel, Shreya Shankar, Jacob Steinhardt, Ian Goodfellow, Percy Liang, Pushmeet Kohli

In this work, we propose a first-order dual SDP algorithm that (1) requires memory only linear in the total number of network activations, (2) only requires a fixed number of forward/backward passes through the network per iteration.

Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples

4 code implementations7 Oct 2020 Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy Mann, Pushmeet Kohli

In the setting with additional unlabeled data, we obtain an accuracy under attack of 65. 88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6. 35% with respect to prior art).

Adversarial Robustness

An Alternative Surrogate Loss for PGD-based Adversarial Testing

4 code implementations21 Oct 2019 Sven Gowal, Jonathan Uesato, Chongli Qin, Po-Sen Huang, Timothy Mann, Pushmeet Kohli

Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching norm-bounded perturbations that cause the inputs of neural networks to be misclassified.

Scalable Neural Learning for Verifiable Consistency with Temporal Specifications

no code implementations25 Sep 2019 Sumanth Dathathri, Johannes Welbl, Krishnamurthy (Dj) Dvijotham, Ramana Kumar, Aditya Kanade, Jonathan Uesato, Sven Gowal, Po-Sen Huang, Pushmeet Kohli

Formal verification of machine learning models has attracted attention recently, and significant progress has been made on proving simple properties like robustness to small perturbations of the input features.

Adversarial Robustness Language Modelling

Are Labels Required for Improving Adversarial Robustness?

1 code implementation NeurIPS 2019 Jonathan Uesato, Jean-Baptiste Alayrac, Po-Sen Huang, Robert Stanforth, Alhussein Fawzi, Pushmeet Kohli

Recent work has uncovered the interesting (and somewhat surprising) finding that training models to be invariant to adversarial perturbations requires substantially larger datasets than those required for standard classification.

Adversarial Robustness

Verification of Non-Linear Specifications for Neural Networks

no code implementations ICLR 2019 Chongli Qin, Krishnamurthy, Dvijotham, Brendan O'Donoghue, Rudy Bunel, Robert Stanforth, Sven Gowal, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli

We show that a number of important properties of interest can be modeled within this class, including conservation of energy in a learned dynamics model of a physical system; semantic consistency of a classifier's output labels under adversarial perturbations and bounding errors in a system that predicts the summation of handwritten digits.

Strength in Numbers: Trading-off Robustness and Computation via Adversarially-Trained Ensembles

no code implementations ICLR 2019 Edward Grefenstette, Robert Stanforth, Brendan O'Donoghue, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli

We show that increasing the number of parameters in adversarially-trained models increases their robustness, and in particular that ensembling smaller models while adversarially training the entire ensemble as a single model is a more efficient way of spending said budget than simply using a larger single model.

Self-Driving Cars

On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models

9 code implementations30 Oct 2018 Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, Pushmeet Kohli

Recent work has shown that it is possible to train deep neural networks that are provably robust to norm-bounded adversarial perturbations.

Training verified learners with learned verifiers

no code implementations25 May 2018 Krishnamurthy Dvijotham, Sven Gowal, Robert Stanforth, Relja Arandjelovic, Brendan O'Donoghue, Jonathan Uesato, Pushmeet Kohli

This paper proposes a new algorithmic framework, predictor-verifier training, to train neural networks that are verifiable, i. e., networks that provably satisfy some desired input-output properties.

Semantic Code Repair using Neuro-Symbolic Transformation Networks

no code implementations ICLR 2018 Jacob Devlin, Jonathan Uesato, Rishabh Singh, Pushmeet Kohli

We study the problem of semantic code repair, which can be broadly defined as automatically fixing non-syntactic bugs in source code.

Code Repair

RobustFill: Neural Program Learning under Noisy I/O

3 code implementations ICML 2017 Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, Pushmeet Kohli

Recently, two competing approaches for automatic program learning have received significant attention: (1) neural program synthesis, where a neural network is conditioned on input/output (I/O) examples and learns to generate a program, and (2) neural program induction, where a neural network generates new outputs directly using a latent program representation.

Program induction Program Synthesis

Cannot find the paper you are looking for? You can Submit a new open access paper.