Search Results for author: Tim Rocktäschel

Found 82 papers, 48 papers with code

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

no code implementations • 26 Feb 2024 • Mikayel Samvelyan, Sharath Chandra Raparthy, Andrei Lupu, Eric Hambro, Aram H. Markosyan, Manish Bhatt, Yuning Mao, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Tim Rocktäschel, Roberta Raileanu

As large language models (LLMs) become increasingly prevalent across many real-world applications, understanding and enhancing their robustness to user inputs is of paramount importance.

Question Answering

Scaling Opponent Shaping to High Dimensional Games

no code implementations • 19 Dec 2023 • Akbir Khan, Timon Willi, Newton Kwan, Andrea Tacchetti, Chris Lu, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

In multi-agent settings with mixed incentives, methods developed for zero-sum games have been shown to lead to detrimental outcomes.

Meta-Learning

Leading the Pack: N-player Opponent Shaping

no code implementations • 19 Dec 2023 • Alexandra Souly, Timon Willi, Akbir Khan, Robert Kirk, Chris Lu, Edward Grefenstette, Tim Rocktäschel

We evaluate on over 4 different environments, varying the number of players from 3 to 5, and demonstrate that model-based OS methods converge to equilibrium with better global welfare than naive learning.

H-GAP: Humanoid Control with a Generalist Planner

no code implementations • 5 Dec 2023 • Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian

However, the extensive collection of human motion-captured data and the derived datasets of humanoid trajectories, such as MoCapAct, pave the way to tackling these challenges.

Humanoid Control • Model Predictive Control • +1

minimax: Efficient Baselines for Autocurricula in JAX

1 code implementation • 21 Nov 2023 • Minqi Jiang, Michael Dennis, Edward Grefenstette, Tim Rocktäschel

This compute requirement is a major obstacle to rapid innovation for the field.

Decision Making

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

no code implementations • 21 Nov 2023 • Samyak Jain, Robert Kirk, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Edward Grefenstette, Tim Rocktäschel, David Scott Krueger

Fine-tuning large pre-trained models has become the de facto strategy for developing both task-specific and general-purpose machine learning systems, including developing models that are safe to deploy.

Network Pruning

Mix-ME: Quality-Diversity for Multi-Agent Learning

no code implementations • 3 Nov 2023 • Garðar Ingvarsson, Mikayel Samvelyan, Bryan Lim, Manon Flageat, Antoine Cully, Tim Rocktäschel

In many real-world systems, such as adaptive robotics, achieving a single, optimised solution may be insufficient.

Continuous Control

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

1 code implementation • 28 Sep 2023 • Chrisantha Fernando, Dylan Banarse, Henryk Michalewski, Simon Osindero, Tim Rocktäschel

Driven by an LLM, Promptbreeder mutates a population of task-prompts, and subsequently evaluates them for fitness on a training set.
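
In the spirit of that loop, here is a minimal sketch of mutation-and-selection over prompts. The `llm` and `fitness` callables are hypothetical stand-ins, and the self-referential evolution of mutation-prompts that gives Promptbreeder its name is omitted:

```python
def evolve_prompts(llm, fitness, seed_prompts, generations=10):
    """Toy prompt-evolution loop (illustrative, not Promptbreeder itself).

    llm(text) -> str         hypothetical completion function
    fitness(prompt) -> float hypothetical score on a training set
    """
    population = list(seed_prompts)
    for _ in range(generations):
        # Ask the LLM to mutate each task-prompt in the population.
        offspring = [llm(f"Improve this instruction: {p}") for p in population]
        # Truncation selection: keep the fittest half of parents + children.
        pool = sorted(population + offspring, key=fitness, reverse=True)
        population = pool[: len(seed_prompts)]
    return population
```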

Stabilizing Unsupervised Environment Design with a Learned Adversary

1 code implementation • 21 Aug 2023 • Ishita Mediratta, Minqi Jiang, Jack Parker-Holder, Michael Dennis, Eugene Vinitsky, Tim Rocktäschel

As a result, we make it possible for PAIRED to match or exceed state-of-the-art methods, producing robust agents in several established challenging procedurally-generated environments, including a partially-observed maze navigation task and a continuous-control car racing environment.

Car Racing • Reinforcement Learning (RL)

MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning

no code implementations • 6 Mar 2023 • Mikayel Samvelyan, Akbir Khan, Michael Dennis, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Roberta Raileanu, Tim Rocktäschel

Open-ended learning methods that automatically generate a curriculum of increasingly challenging tasks serve as a promising avenue toward generally capable reinforcement learning agents.

Continuous Control • Multi-agent Reinforcement Learning • +2

General Intelligence Requires Rethinking Exploration

no code implementations • 15 Nov 2022 • Minqi Jiang, Tim Rocktäschel, Edward Grefenstette

We are at the cusp of a transition from "learning from data" to "learning what data to learn from" as a central focus of artificial intelligence (AI) research.

reinforcement-learning • Reinforcement Learning (RL)

Dungeons and Data: A Large-Scale NetHack Dataset

1 code implementation • 1 Nov 2022 • Eric Hambro, Roberta Raileanu, Danielle Rothermel, Vegard Mella, Tim Rocktäschel, Heinrich Küttler, Naila Murray

Recent breakthroughs in the development of agents to solve challenging sequential decision making problems such as Go, StarCraft, or DOTA, have relied on both simulated environments and large-scale datasets.

Decision Making • NetHack • +2

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs

1 code implementation • NeurIPS 2023 • Laura Ruis, Akbir Khan, Stella Biderman, Sara Hooker, Tim Rocktäschel, Edward Grefenstette

We present our findings as the starting point for further research into evaluating how LLMs interpret language in context and to drive the development of more pragmatic and useful models of human discourse.

Exploration via Elliptical Episodic Bonuses

3 code implementations • 11 Oct 2022 • Mikael Henaff, Roberta Raileanu, Minqi Jiang, Tim Rocktäschel

In recent years, a number of reinforcement learning (RL) methods have been proposed to explore complex environments which differ across episodes.

Reinforcement Learning (RL)

Efficient Planning in a Compact Latent Action Space

1 code implementation • 22 Aug 2022 • Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian

Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.

Continuous Control • Decision Making • +1

Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning

2 code implementations • 23 Jul 2022 • Michael Matthews, Mikayel Samvelyan, Jack Parker-Holder, Edward Grefenstette, Tim Rocktäschel

In this paper, we investigate how skills can be incorporated into the training of reinforcement learning (RL) agents in complex environments with large state-action spaces and sparse rewards.

Inductive Bias • NetHack • +2

GriddlyJS: A Web IDE for Reinforcement Learning

no code implementations • 13 Jul 2022 • Christopher Bamford, Minqi Jiang, Mikayel Samvelyan, Tim Rocktäschel

Progress in reinforcement learning (RL) research is often driven by the design of new, challenging environments -- a costly undertaking requiring skills orthogonal to those of a typical machine learning researcher.

Offline RL • reinforcement-learning • +1

Grounding Aleatoric Uncertainty for Unsupervised Environment Design

1 code implementation • 11 Jul 2022 • Minqi Jiang, Michael Dennis, Jack Parker-Holder, Andrei Lupu, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution.

Reinforcement Learning (RL)

Graph Backup: Data Efficient Backup Exploiting Markovian Transitions

1 code implementation • 31 May 2022 • Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette

In this paper, we treat the transition data of the MDP as a graph, and define a novel backup operator, Graph Backup, which exploits this graph structure for better value estimation.
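
As a rough illustration of treating pooled transition data as a graph, here is a single backup sweep in Python. This is a sketch under stated assumptions, not the paper's operator, which also incorporates multi-step bootstrapping:

```python
from collections import defaultdict

def graph_backup_sweep(transitions, values, gamma=0.99):
    """One value-backup sweep over pooled transition data (sketch only).

    transitions: list of (s, a, r, s_next) tuples from many trajectories.
    Because identical states share a graph node, each state is backed up
    from *all* observed outgoing edges, not just the single successor
    seen along one trajectory.
    """
    outgoing = defaultdict(list)
    for s, _a, r, s_next in transitions:
        outgoing[s].append((r, s_next))
    return {
        s: sum(r + gamma * values.get(s_next, 0.0) for r, s_next in edges)
           / len(edges)
        for s, edges in outgoing.items()
    }
```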

Atari Games • counterfactual • +2

Evolving Curricula with Regret-Based Environment Design

3 code implementations • 2 Mar 2022 • Jack Parker-Holder, Minqi Jiang, Michael Dennis, Mikayel Samvelyan, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel

Our approach, which we call Adversarially Compounding Complexity by Editing Levels (ACCEL), seeks to constantly produce levels at the frontier of an agent's capabilities, resulting in curricula that start simple but become increasingly complex.
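
A compressed sketch of that compounding loop, with hypothetical `buffer`, `edit`, and regret-estimation helpers standing in for ACCEL's actual machinery:

```python
def accel_step(agent, buffer, edit):
    """Replay a high-regret level, edit it, keep the edit if still useful.

    All helpers are hypothetical stand-ins: buffer.sample() draws a curated
    level, edit(level) applies a small mutation (e.g. adding or removing one
    tile), and agent.estimate_regret() scores learning potential.
    """
    level = buffer.sample()
    candidate = edit(level)
    regret = agent.estimate_regret(candidate)
    if regret > buffer.min_regret():
        buffer.add(candidate, regret)  # complexity compounds over time
```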

Reinforcement Learning (RL)

Generalization in Cooperative Multi-Agent Systems

no code implementations • 31 Jan 2022 • Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson

Specifically, we study generalization bounds under a linear dependence of the underlying dynamics on the agent capabilities, which can be seen as a generalization of Successor Features to MAS.

Generalization Bounds • Multi-agent Reinforcement Learning

Replay-Guided Adversarial Environment Design

4 code implementations • NeurIPS 2021 • Minqi Jiang, Michael Dennis, Jack Parker-Holder, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel

Furthermore, our theory suggests a highly counterintuitive improvement to PLR: by stopping the agent from updating its policy on uncurated levels (training on less data), we can improve the convergence to Nash equilibria.
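
The counterintuitive recipe fits in a few lines. A sketch with hypothetical helpers, showing the core idea that gradient updates happen only on curated, replayed levels:

```python
import random

def robust_plr_step(agent, buffer, new_level_fn, replay_prob=0.5):
    """Train only on replayed levels; merely *score* newly generated ones."""
    if len(buffer) > 0 and random.random() < replay_prob:
        level = buffer.sample()                # curated level
        trajectory = agent.rollout(level)
        agent.update(trajectory)               # gradient step allowed
        buffer.update_score(level, trajectory)
    else:
        level = new_level_fn()                 # uncurated level
        trajectory = agent.rollout(level)      # no policy update here
        buffer.maybe_add(level, trajectory)
```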

Reinforcement Learning (RL)

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

1 code implementation • 27 Sep 2021 • Mikayel Samvelyan, Robert Kirk, Vitaly Kurin, Jack Parker-Holder, Minqi Jiang, Eric Hambro, Fabio Petroni, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel

By leveraging the full set of entities and environment dynamics from NetHack, one of the richest grid-based video games, MiniHack allows designing custom RL testbeds that are fast and convenient to use.

NetHack • reinforcement-learning • +2

Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay

no code implementations • NeurIPS Workshop ICBINB 2021 • Iryna Korshunova, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel, Edward Grefenstette

Prioritized Level Replay (PLR) has been shown to induce adaptive curricula that improve the sample-efficiency and generalization of reinforcement learning policies in environments featuring multiple tasks or levels.

reinforcement-learning • Reinforcement Learning (RL)

Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers

no code implementations • 26 Jul 2021 • Danielle Rothermel, Margaret Li, Tim Rocktäschel, Jakob Foerster

After carefully redesigning the empirical setup, we find that when tuning learning rates properly, pretrained transformers do outperform or match training from scratch in all of our tasks, but only as long as the entire model is finetuned.

Prioritized Level Replay

4 code implementations • 8 Oct 2020 • Minqi Jiang, Edward Grefenstette, Tim Rocktäschel

Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning.

Systematic Generalization

WordCraft: An Environment for Benchmarking Commonsense Agents

1 code implementation • ICML Workshop LaReL 2020 • Minqi Jiang, Jelena Luketina, Nantas Nardelli, Pasquale Minervini, Philip H. S. Torr, Shimon Whiteson, Tim Rocktäschel

This is partly due to the lack of lightweight simulation environments that sufficiently reflect the semantics of the real world and provide knowledge sources grounded with respect to observations in an RL environment.

Benchmarking • Knowledge Graphs • +2

Learning Reasoning Strategies in End-to-End Differentiable Proving

2 code implementations • ICML 2020 • Pasquale Minervini, Sebastian Riedel, Pontus Stenetorp, Edward Grefenstette, Tim Rocktäschel

Attempts to render deep learning models interpretable, data-efficient, and robust have seen some success through hybridisation with rule-based systems, for example, in Neural Theorem Provers (NTPs).

Link Prediction • Relational Reasoning

The NetHack Learning Environment

3 code implementations • NeurIPS 2020 • Heinrich Küttler, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel

Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging environment for RL research based on the popular single-player terminal-based roguelike game, NetHack.

NetHack Score • Reinforcement Learning (RL) • +1

How Context Affects Language Models' Factual Predictions

no code implementations • AKBC 2020 • Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering.
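
A concrete way to probe this kind of cloze-style retrieval, assuming the Hugging Face `transformers` library rather than the paper's own evaluation code:

```python
from transformers import pipeline

# Query a pre-trained masked language model in cloze style.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
for prediction in unmasker("The theory of relativity was developed by [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```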

Information Retrieval • Language Modelling • +4

RTFM: Generalising to New Environment Dynamics via Reading

no code implementations • ICLR 2020 • Victor Zhong, Tim Rocktäschel, Edward Grefenstette

In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.

Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training

1 code implementation • EMNLP 2020 • Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Sebastian Riedel, Tim Rocktäschel

Natural Language Inference (NLI) datasets contain annotation artefacts resulting in spurious correlations between the natural language utterances and their respective entailment classes.

Natural Language Inference • Sentence

RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments

2 code implementations • ICLR 2020 • Roberta Raileanu, Tim Rocktäschel

However, we show that existing methods fall short in procedurally-generated environments where an agent is unlikely to visit a state more than once.

Differentiable Reasoning on Large Knowledge Bases and Natural Language

3 code implementations • 17 Dec 2019 • Pasquale Minervini, Matko Bošnjak, Tim Rocktäschel, Sebastian Riedel, Edward Grefenstette

Reasoning with knowledge expressed in natural language and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering.

Link Prediction • Question Answering • +1

RTFM: Generalising to Novel Environment Dynamics via Reading

1 code implementation • 18 Oct 2019 • Victor Zhong, Tim Rocktäschel, Edward Grefenstette

In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.

NLProlog: Reasoning with Weak Unification for Question Answering in Natural Language

1 code implementation • ACL 2019 • Leon Weber, Pasquale Minervini, Jannes Münchmeyer, Ulf Leser, Tim Rocktäschel

In contrast, neural models can cope very well with ambiguity by learning distributed representations of words and their composition from data, but lead to models that are difficult to interpret.

Question Answering • Sentence

A Survey of Reinforcement Learning Informed by Natural Language

no code implementations • 10 Jun 2019 • Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel

To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the compositional, relational, and hierarchical structure of the world, and learn to transfer it to the task at hand.

Decision Making • Instruction Following • +5

Scalable Neural Theorem Proving on Knowledge Bases and Natural Language

no code implementations • ICLR 2019 • Pasquale Minervini, Matko Bosnjak, Tim Rocktäschel, Edward Grefenstette, Sebastian Riedel

Reasoning over text and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering.

Automated Theorem Proving • Link Prediction • +2

NLProlog: Reasoning with Weak Unification for Natural Language Question Answering

no code implementations • ICLR 2019 • Leon Weber, Pasquale Minervini, Ulf Leser, Tim Rocktäschel

Currently, most work in natural language processing focuses on neural networks which learn distributed representations of words and their composition, thereby performing well in the presence of large linguistic variability.

Question Answering • Sentence

Learning to Speak and Act in a Fantasy Text Adventure Game

1 code implementation • IJCNLP 2019 • Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston

We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relates to agents that can talk and act successfully.

Retrieval

e-SNLI: Natural Language Inference with Natural Language Explanations

2 code implementations • NeurIPS 2018 • Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, Phil Blunsom

In order for machine learning to garner widespread public adoption, models must be able to provide interpretable and robust explanations for their decisions, as well as learn from human-provided explanations at train time.

Natural Language Inference • Sentence

Stable Opponent Shaping in Differentiable Games

no code implementations • ICLR 2019 • Alistair Letcher, Jakob Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson

A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel -- from GANs and intrinsic curiosity to multi-agent RL.

Towards Neural Theorem Proving at Scale

no code implementations • 21 Jul 2018 • Pasquale Minervini, Matko Bosnjak, Tim Rocktäschel, Sebastian Riedel

Neural models combining representation learning and reasoning in an end-to-end trainable manner are receiving increasing interest.

Automated Theorem Proving • Representation Learning

DiCE: The Infinitely Differentiable Monte Carlo Estimator

1 code implementation • ICML 2018 • Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric Xing, Shimon Whiteson

Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives.
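
DiCE fixes this with its "MagicBox" operator, which evaluates to 1 in the forward pass but reproduces the score-function terms under repeated differentiation. A PyTorch sketch follows; the operator itself matches the paper's definition, while the surrounding usage comment is illustrative:

```python
import torch

def magic_box(logp):
    """exp(x - stop_gradient(x)): equals 1 forward, but differentiating
    it (to any order) recovers the gradients of logp."""
    return torch.exp(logp - logp.detach())

# Illustrative surrogate objective: multiplying each cost by the magic box
# of the summed log-probabilities it depends on yields estimators whose
# higher-order derivatives are also correct, e.g.:
# surrogate = (magic_box(logp_dependencies) * costs).sum()
```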

Meta-Learning

Jack the Reader - A Machine Reading Framework

2 code implementations • 20 Jun 2018 • Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel

For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions.

Link Prediction • Natural Language Inference • +3

DiCE: The Infinitely Differentiable Monte-Carlo Estimator

5 code implementations • 14 Feb 2018 • Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson

Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives.

Meta-Learning

Combining Representation Learning with Logic for Language Processing

no code implementations • 27 Dec 2017 • Tim Rocktäschel

The current state-of-the-art in many natural language processing and automated knowledge base completion tasks is held by representation learning methods which learn distributed vector representations of symbols via gradient-based optimization.

Formal Logic • Knowledge Base Completion • +1

TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning

1 code implementation • ICLR 2018 • Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson

To address these challenges, we propose TreeQN, a differentiable, recursive, tree-structured model that serves as a drop-in replacement for any value function network in deep RL with discrete actions.
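
A toy rendering of such a tree-structured value function, where the learned modules are hypothetical stand-ins and the hard max used here simplifies TreeQN's differentiable backup:

```python
def tree_q(latent, transition, reward, value, actions, depth, gamma=0.99):
    """Depth-limited, model-based value estimate built into the network.

    transition(s, a), reward(s, a), and value(s) stand in for learned
    modules predicting latent dynamics, rewards, and leaf values.
    """
    if depth == 0:
        return value(latent)
    backups = []
    for a in actions:
        nxt = transition(latent, a)
        backups.append(reward(latent, a)
                       + gamma * tree_q(nxt, transition, reward, value,
                                        actions, depth - 1, gamma))
    return max(backups)  # the paper uses a differentiable (softmax-style) backup
```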

Atari Games • reinforcement-learning • +2

Adversarial Sets for Regularising Neural Link Predictors

1 code implementation • 24 Jul 2017 • Pasquale Minervini, Thomas Demeester, Tim Rocktäschel, Sebastian Riedel

The training objective is defined as a minimax problem, where an adversary finds the most offending adversarial examples by maximising the inconsistency loss, and the model is trained by jointly minimising a supervised loss and the inconsistency loss on the adversarial examples.
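
One plausible way to write that objective (notation assumed here, not taken from the paper): with a supervised link-prediction loss L_sup and an inconsistency loss L_inc measured on an adversarially chosen set S of inputs,

```latex
\min_{\theta} \; \mathcal{L}_{\text{sup}}(\theta)
  \;+\; \lambda \, \max_{\mathcal{S}} \; \mathcal{L}_{\text{inc}}(\theta, \mathcal{S})
```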

Link Prediction • Relational Reasoning

End-to-End Differentiable Proving

3 code implementations • NeurIPS 2017 • Tim Rocktäschel, Sebastian Riedel

We introduce neural networks for end-to-end differentiable proving of queries to knowledge bases by operating on dense vector representations of symbols.
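
The key ingredient is replacing symbolic unification with a differentiable similarity between embeddings. A minimal sketch follows; the kernel form matches the paper's radial basis function up to constants, and the surrounding comment is illustrative:

```python
import torch

def soft_unify(symbol_a, symbol_b):
    """Soft unification: 1.0 for identical embeddings, decaying with distance.

    In a Neural Theorem Prover, a proof's score is the minimum of such
    scores along the proof path, keeping the whole search differentiable.
    """
    return torch.exp(-torch.norm(symbol_a - symbol_b))
```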

Link Prediction

Frustratingly Short Attention Spans in Neural Language Modeling

no code implementations • 15 Feb 2017 • Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel

This vector is used both for predicting the next token as well as for the key and value of a differentiable memory of a token history.

Language Modelling

Learning Python Code Suggestion with a Sparse Pointer Network

5 code implementations • 24 Nov 2016 • Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel

By augmenting a neural language model with a pointer network specialized in referring to predefined classes of identifiers, we obtain a much lower perplexity and a 5 percentage point increase in accuracy for code suggestion compared to an LSTM baseline.
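
The mixture this describes can be sketched as a gate between a vocabulary softmax and a pointer distribution over previously used identifiers. Illustrative PyTorch; the names and shapes are assumptions, not the paper's code:

```python
import torch
import torch.nn.functional as F

def mix_vocab_and_pointer(vocab_logits, pointer_logits, gate_logit, history_ids):
    """Blend generating from the vocabulary with copying a past identifier.

    vocab_logits:   (V,) scores over the full vocabulary
    pointer_logits: (H,) scores over the H identifiers seen so far
    history_ids:    (H,) LongTensor of those identifiers' vocabulary indices
    gate_logit:     scalar deciding copy vs. generate
    """
    g = torch.sigmoid(gate_logit)
    p_vocab = F.softmax(vocab_logits, dim=-1)
    p_ptr = F.softmax(pointer_logits, dim=-1)
    # Scatter pointer mass back onto the corresponding vocabulary entries.
    p_copy = torch.zeros_like(p_vocab).index_add_(0, history_ids, p_ptr)
    return g * p_copy + (1.0 - g) * p_vocab
```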

Language Modelling

emoji2vec: Learning Emoji Representations from their Description

7 code implementations • WS 2016 • Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel

Many current natural language processing applications for social media rely on representation learning and utilize pre-trained word embeddings.

Representation Learning • Sentiment Analysis • +1

Lifted Rule Injection for Relation Embeddings

no code implementations • EMNLP 2016 • Thomas Demeester, Tim Rocktäschel, Sebastian Riedel

Methods based on representation learning currently hold the state-of-the-art in many natural language processing and knowledge base inference tasks.

Relation • Representation Learning

Stance Detection with Bidirectional Conditional Encoding

1 code implementation • EMNLP 2016 • Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva

Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton as "positive", "negative", or "neutral".

Stance Detection

MuFuRU: The Multi-Function Recurrent Unit

no code implementations • 9 Jun 2016 • Dirk Weissenborn, Tim Rocktäschel

Recurrent neural networks such as the GRU and LSTM have found wide adoption in natural language processing and achieve state-of-the-art results for many tasks.

Language Modelling • Sentiment Analysis

Generating Natural Language Inference Chains

no code implementations • 4 Jun 2016 • Vladyslav Kolesnyk, Tim Rocktäschel, Sebastian Riedel

We take entailment-pairs of the Stanford Natural Language Inference corpus and train an LSTM with attention.

Machine Translation • Natural Language Inference • +2

Programming with a Differentiable Forth Interpreter

1 code implementation • ICML 2017 • Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel

Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model.

Reasoning about Entailment with Neural Attention

7 code implementations • 22 Sep 2015 • Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom

We extend this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases.
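
For reference, the word-by-word attention in this paper takes roughly the following form (reproduced from memory, so symbols may differ slightly from the published notation), attending over premise representations Y while reading hypothesis token t:

```latex
\mathbf{M}_t = \tanh\!\left(\mathbf{W}^{y}\mathbf{Y}
  + (\mathbf{W}^{h}\mathbf{h}_t + \mathbf{W}^{r}\mathbf{r}_{t-1}) \otimes \mathbf{e}_L\right), \quad
\alpha_t = \operatorname{softmax}\!\left(\mathbf{w}^{\top}\mathbf{M}_t\right), \quad
\mathbf{r}_t = \mathbf{Y}\alpha_t^{\top} + \tanh\!\left(\mathbf{W}^{t}\mathbf{r}_{t-1}\right)
```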

Natural Language Inference
