no code implementations • 28 Sep 2023 • Chrisantha Fernando, Dylan Banarse, Henryk Michalewski, Simon Osindero, Tim Rocktäschel
Driven by an LLM, Promptbreeder mutates a population of task-prompts, and subsequently evaluates them for fitness on a training set.
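A minimal sketch of the evolutionary loop the sentence above describes, assuming hypothetical callables `llm_mutate` (an LLM-driven mutation of a prompt) and `score_on_training_set` (task fitness); this illustrates the idea only and is not the paper's implementation.

```python
import random

def evolve_prompts(seed_prompts, llm_mutate, score_on_training_set,
                   generations=10, population_size=20):
    population = list(seed_prompts)
    for _ in range(generations):
        # Evaluate the fitness of every task-prompt on the training set.
        scored = sorted(population, key=score_on_training_set, reverse=True)
        # Keep the fitter half, then refill the population with LLM-driven mutations.
        survivors = scored[: population_size // 2]
        children = [llm_mutate(random.choice(survivors))
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=score_on_training_set)
```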
1 code implementation • 21 Aug 2023 • Ishita Mediratta, Minqi Jiang, Jack Parker-Holder, Michael Dennis, Eugene Vinitsky, Tim Rocktäschel
As a result, we make it possible for PAIRED to match or exceed state-of-the-art methods, producing robust agents in several established challenging procedurally-generated environments, including a partially-observed maze navigation task and a continuous-control car racing environment.
no code implementations • 6 Mar 2023 • Mikayel Samvelyan, Akbir Khan, Michael Dennis, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Roberta Raileanu, Tim Rocktäschel
Open-ended learning methods that automatically generate a curriculum of increasingly challenging tasks serve as a promising avenue toward generally capable reinforcement learning agents.
no code implementations • 18 Jan 2023 • Adaptive Agent Team, Jakob Bauer, Kate Baumli, Satinder Baveja, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson, Hannah Openshaw, Jack Parker-Holder, Shreya Pathak, Nicolas Perez-Nieves, Nemanja Rakicevic, Tim Rocktäschel, Yannick Schroecker, Jakub Sygnowski, Karl Tuyls, Sarah York, Alexander Zacherl, Lei Zhang
Foundation models have shown impressive adaptation and scalability in supervised and self-supervised learning problems, but so far these successes have not fully translated to reinforcement learning (RL).
no code implementations • 15 Nov 2022 • Minqi Jiang, Tim Rocktäschel, Edward Grefenstette
We are at the cusp of a transition from "learning from data" to "learning what data to learn from" as a central focus of artificial intelligence (AI) research.
1 code implementation • 1 Nov 2022 • Eric Hambro, Roberta Raileanu, Danielle Rothermel, Vegard Mella, Tim Rocktäschel, Heinrich Küttler, Naila Murray
Recent breakthroughs in the development of agents to solve challenging sequential decision-making problems such as Go, StarCraft, or DOTA have relied on both simulated environments and large-scale datasets.
1 code implementation • 26 Oct 2022 • Laura Ruis, Akbir Khan, Stella Biderman, Sara Hooker, Tim Rocktäschel, Edward Grefenstette
We present our findings as the starting point for further research into evaluating how LLMs interpret language in context and to drive the development of more pragmatic and useful models of human discourse.
no code implementations • 23 Oct 2022 • Yingchen Xu, Jack Parker-Holder, Aldo Pacchiano, Philip J. Ball, Oleh Rybkin, Stephen J. Roberts, Tim Rocktäschel, Edward Grefenstette
We then present CASCADE, a novel approach for self-supervised exploration in this new setting.
2 code implementations • 11 Oct 2022 • Mikael Henaff, Roberta Raileanu, Minqi Jiang, Tim Rocktäschel
In recent years, a number of reinforcement learning (RL) methods have been proposed to explore complex environments which differ across episodes.
1 code implementation • 30 Sep 2022 • Victor Zhong, Jesse Mu, Luke Zettlemoyer, Edward Grefenstette, Tim Rocktäschel
Recent work has shown that augmenting environments with language descriptions improves policy learning.
1 code implementation • 22 Aug 2022 • Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian
Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.
2 code implementations • 23 Jul 2022 • Michael Matthews, Mikayel Samvelyan, Jack Parker-Holder, Edward Grefenstette, Tim Rocktäschel
In this paper, we investigate how skills can be incorporated into the training of reinforcement learning (RL) agents in complex environments with large state-action spaces and sparse rewards.
no code implementations • 13 Jul 2022 • Christopher Bamford, Minqi Jiang, Mikayel Samvelyan, Tim Rocktäschel
Progress in reinforcement learning (RL) research is often driven by the design of new, challenging environments -- a costly undertaking requiring skills orthogonal to those of a typical machine learning researcher.
1 code implementation • 11 Jul 2022 • Minqi Jiang, Michael Dennis, Jack Parker-Holder, Andrei Lupu, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster
Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution.
1 code implementation • 31 May 2022 • Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette
In this paper, we treat the transition data of the MDP as a graph, and define a novel backup operator, Graph Backup, which exploits this graph structure for better value estimation.
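A rough sketch of the "transitions as a graph" idea, assuming transitions are pooled as (state, action, reward, next_state, done) tuples; the operator below is a simplified max-backup sweep over the graph, not the paper's exact Graph Backup operator.

```python
from collections import defaultdict

def graph_backup(transitions, gamma=0.99, sweeps=50):
    """transitions: iterable of (state, action, reward, next_state, done)."""
    # Group transitions by source state so backups can reuse every observed edge.
    edges = defaultdict(list)
    for s, a, r, s_next, done in transitions:
        edges[s].append((r, s_next, done))
    values = defaultdict(float)
    for _ in range(sweeps):
        for s, outgoing in edges.items():
            # Back up each state from all of its outgoing edges in the graph,
            # rather than from a single sampled trajectory continuation.
            values[s] = max(r + gamma * (0.0 if done else values[s_next])
                            for r, s_next, done in outgoing)
    return values
```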
no code implementations • 22 Mar 2022 • Eric Hambro, Sharada Mohanty, Dmitrii Babaev, Minwoo Byeon, Dipam Chakraborty, Edward Grefenstette, Minqi Jiang, DaeJin Jo, Anssi Kanervisto, Jongmin Kim, Sungwoong Kim, Robert Kirk, Vitaly Kurin, Heinrich Küttler, Taehwon Kwon, Donghoon Lee, Vegard Mella, Nantas Nardelli, Ivan Nazarov, Nikita Ovsov, Jack Parker-Holder, Roberta Raileanu, Karolis Ramanauskas, Tim Rocktäschel, Danielle Rothermel, Mikayel Samvelyan, Dmitry Sorokin, Maciej Sypetkowski, Michał Sypetkowski
In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge.
1 code implementation • 2 Mar 2022 • Jack Parker-Holder, Minqi Jiang, Michael Dennis, Mikayel Samvelyan, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel
Our approach, which we call Adversarially Compounding Complexity by Editing Levels (ACCEL), seeks to constantly produce levels at the frontier of an agent's capabilities, resulting in curricula that start simple but become increasingly complex.
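An illustrative edit-and-curate step in the spirit of ACCEL, assuming placeholder callables `random_level`, `small_edit`, and `estimate_regret` and an `agent.train_on` method; none of these names come from the released code.

```python
import random

def accel_step(agent, level_buffer, random_level, small_edit, estimate_regret,
               replay_prob=0.8):
    if level_buffer and random.random() < replay_prob:
        # Replay the curated level the agent currently finds hardest and train on it.
        level = max(level_buffer, key=lambda lv: estimate_regret(agent, lv))
        agent.train_on(level)
        # Make a small edit to the replayed level; keep it if it is still challenging,
        # so complexity compounds at the frontier of the agent's capabilities.
        edited = small_edit(level)
        if estimate_regret(agent, edited) >= estimate_regret(agent, level):
            level_buffer.append(edited)
    else:
        # Otherwise add a freshly generated level to the pool for later curation.
        level_buffer.append(random_level())
    return level_buffer
```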
1 code implementation • 17 Feb 2022 • Jesse Mu, Victor Zhong, Roberta Raileanu, Minqi Jiang, Noah Goodman, Tim Rocktäschel, Edward Grefenstette
Reinforcement learning (RL) agents are particularly hard to train when rewards are sparse.
no code implementations • 31 Jan 2022 • Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson
Specifically, we study generalization bounds under a linear dependence of the underlying dynamics on the agent capabilities, which can be seen as a generalization of Successor Features to MAS.
no code implementations • 18 Nov 2021 • Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel
This survey is an overview of this nascent field.
2 code implementations • NeurIPS 2021 • Minqi Jiang, Michael Dennis, Jack Parker-Holder, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel
Furthermore, our theory suggests a highly counterintuitive improvement to PLR: by stopping the agent from updating its policy on uncurated levels (training on less data), we can improve the convergence to Nash equilibria.
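A hedged sketch of that modification: the policy is updated only on curated replay levels, while new, uncurated levels are merely scored and added to the buffer. `level_sampler`, `agent.train_on`, and `agent.evaluate_on` are illustrative placeholders, not the released API.

```python
import random

def robust_plr_iteration(agent, level_sampler, level_buffer, replay_prob=0.5):
    if level_buffer and random.random() < replay_prob:
        # Replay a curated level with high learning potential and update the policy.
        level, _ = max(level_buffer, key=lambda item: item[1])
        agent.train_on(level)
    else:
        # On a new, uncurated level, only collect rollouts to score it;
        # crucially, no policy update happens here.
        level = level_sampler()
        score = agent.evaluate_on(level)  # e.g. a regret / TD-error estimate
        level_buffer.append((level, score))
```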
no code implementations • 29 Sep 2021 • Jack Parker-Holder, Minqi Jiang, Michael D Dennis, Mikayel Samvelyan, Jakob Nicolaus Foerster, Edward Grefenstette, Tim Rocktäschel
Deep Reinforcement Learning (RL) has recently produced impressive results in a series of settings such as games and robotics.
1 code implementation • 27 Sep 2021 • Mikayel Samvelyan, Robert Kirk, Vitaly Kurin, Jack Parker-Holder, Minqi Jiang, Eric Hambro, Fabio Petroni, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel
By leveraging the full set of entities and environment dynamics from NetHack, one of the richest grid-based video games, MiniHack allows designing custom RL testbeds that are fast and convenient to use.
no code implementations • NeurIPS Workshop ICBINB 2021 • Iryna Korshunova, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel, Edward Grefenstette
Prioritized Level Replay (PLR) has been shown to induce adaptive curricula that improve the sample-efficiency and generalization of reinforcement learning policies in environments featuring multiple tasks or levels.
no code implementations • 26 Jul 2021 • Danielle Rothermel, Margaret Li, Tim Rocktäschel, Jakob Foerster
After carefully redesigning the empirical setup, we find that when tuning learning rates properly, pretrained transformers do outperform or match training from scratch in all of our tasks, but only as long as the entire model is finetuned.
3 code implementations • 8 Oct 2020 • Minqi Jiang, Edward Grefenstette, Tim Rocktäschel
Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning.
1 code implementation • ICLR 2021 • Vitaly Kurin, Maximilian Igl, Tim Rocktäschel, Wendelin Boehmer, Shimon Whiteson
They also allow practitioners to inject biases encoded in the structure of the input graph.
no code implementations • NAACL 2021 • Prithviraj Ammanabrolu, Jack Urbanek, Margaret Li, Arthur Szlam, Tim Rocktäschel, Jason Weston
We seek to create agents that both act and communicate with other agents in pursuit of a goal.
3 code implementations • NAACL 2021 • Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rocktäschel, Sebastian Riedel
We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance.
Ranked #3 on Entity Linking on KILT: WNED-CWEB
1 code implementation • ICML Workshop LaReL 2020 • Minqi Jiang, Jelena Luketina, Nantas Nardelli, Pasquale Minervini, Philip H. S. Torr, Shimon Whiteson, Tim Rocktäschel
This is partly due to the lack of lightweight simulation environments that sufficiently reflect the semantics of the real world and provide knowledge sources grounded with respect to observations in an RL environment.
2 code implementations • ICML 2020 • Pasquale Minervini, Sebastian Riedel, Pontus Stenetorp, Edward Grefenstette, Tim Rocktäschel
Attempts to render deep learning models interpretable, data-efficient, and robust have seen some success through hybridisation with rule-based systems, for example, in Neural Theorem Provers (NTPs).
Ranked #1 on Relational Reasoning on CLUTRR (k=3)
3 code implementations • NeurIPS 2020 • Heinrich Küttler, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel
Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging environment for RL research based on the popular single-player terminal-based roguelike game, NetHack.
Ranked #1 on NetHack Score on NetHack Learning Environment
5 code implementations • ICLR 2021 • Andres Campero, Roberta Raileanu, Heinrich Küttler, Joshua B. Tenenbaum, Tim Rocktäschel, Edward Grefenstette
A key challenge for reinforcement learning (RL) consists of learning in environments with sparse extrinsic rewards.
5 code implementations • NeurIPS 2020 • Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela
Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks.
Ranked #4 on Question Answering on WebQuestions
no code implementations • AKBC 2020 • Fabio Petroni, Patrick Lewis, Aleksandra Piktus, Tim Rocktäschel, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering.
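For illustration, a zero-shot cloze-style query against a masked language model can be as simple as the following (the model name and query string are arbitrary examples, not the paper's probe set):

```python
from transformers import pipeline

# Fill-in-the-blank query to a pretrained masked language model.
unmasker = pipeline("fill-mask", model="bert-base-uncased")
predictions = unmasker("The theory of relativity was developed by [MASK].")
for p in predictions[:3]:
    # Each prediction carries the filled token and the model's probability for it.
    print(f"{p['token_str']:>12}  {p['score']:.3f}")
```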
no code implementations • ICLR 2020 • Victor Zhong, Tim Rocktäschel, Edward Grefenstette
In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.
1 code implementation • EMNLP 2020 • Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Sebastian Riedel, Tim Rocktäschel
Natural Language Inference (NLI) datasets contain annotation artefacts resulting in spurious correlations between the natural language utterances and their respective entailment classes.
2 code implementations • ICLR 2020 • Roberta Raileanu, Tim Rocktäschel
However, we show that existing methods fall short in procedurally-generated environments where an agent is unlikely to visit a state more than once.
3 code implementations • 17 Dec 2019 • Pasquale Minervini, Matko Bošnjak, Tim Rocktäschel, Sebastian Riedel, Edward Grefenstette
Reasoning with knowledge expressed in natural language and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering.
Ranked #3 on Link Prediction on FB122
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Luca Massarelli, Fabio Petroni, Aleksandra Piktus, Myle Ott, Tim Rocktäschel, Vassilis Plachouras, Fabrizio Silvestri, Sebastian Riedel
A generated sentence is verifiable if it can be corroborated or disproved by Wikipedia, and we find that the verifiability of generated text strongly depends on the decoding strategy.
1 code implementation • 18 Oct 2019 • Victor Zhong, Tim Rocktäschel, Edward Grefenstette
In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.
1 code implementation • 9 Oct 2019 • Viswanath Sivakumar, Olivier Delalleau, Tim Rocktäschel, Alexander H. Miller, Heinrich Küttler, Nantas Nardelli, Mike Rabbat, Joelle Pineau, Sebastian Riedel
This is largely an artifact of building on top of frameworks designed for RL in games (e.g., OpenAI Gym).
3 code implementations • 8 Oct 2019 • Heinrich Küttler, Nantas Nardelli, Thibaut Lavril, Marco Selvatici, Viswanath Sivakumar, Tim Rocktäschel, Edward Grefenstette
TorchBeast is a platform for reinforcement learning (RL) research in PyTorch.
1 code implementation • IJCNLP 2019 • Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
Recent progress in pretraining language models on large textual corpora has led to a surge of improvements for downstream NLP tasks.
1 code implementation • ACL 2019 • Leon Weber, Pasquale Minervini, Jannes Münchmeyer, Ulf Leser, Tim Rocktäschel
In contrast, neural models can cope very well with ambiguity by learning distributed representations of words and their composition from data, but the resulting models are difficult to interpret.
no code implementations • 10 Jun 2019 • Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel
To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the compositional, relational, and hierarchical structure of the world, and learn to transfer it to the task at hand.
no code implementations • ICLR 2019 • Pasquale Minervini, Matko Bosnjak, Tim Rocktäschel, Edward Grefenstette, Sebastian Riedel
Reasoning over text and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering.
no code implementations • ICLR 2019 • Leon Weber, Pasquale Minervini, Ulf Leser, Tim Rocktäschel
Currently, most work in natural language processing focuses on neural networks which learn distributed representations of words and their composition, thereby performing well in the presence of large linguistic variability.
1 code implementation • IJCNLP 2019 • Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston
We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relate to agents that can talk and act successfully.
2 code implementations • NeurIPS 2018 • Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, Phil Blunsom
In order for machine learning to garner widespread public adoption, models must be able to provide interpretable and robust explanations for their decisions, as well as learn from human-provided explanations at train time.
Ranked #1 on Natural Language Inference on e-SNLI
no code implementations • ICLR 2019 • Alistair Letcher, Jakob Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson
A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel -- from GANs and intrinsic curiosity to multi-agent RL.
no code implementations • 27 Sep 2018 • Jingkai Mao, Jakob Foerster, Tim Rocktäschel, Gregory Farquhar, Maruan Al-Shedivat, Shimon Whiteson
To improve the sample efficiency of DiCE, we propose a new baseline term for higher order gradient estimation.
no code implementations • EMNLP 2018 • Marzieh Saeidi, Max Bartolo, Patrick Lewis, Sameer Singh, Tim Rocktäschel, Mike Sheldon, Guillaume Bouchard, Sebastian Riedel
This task requires both the interpretation of rules and the application of background knowledge.
no code implementations • 21 Jul 2018 • Pasquale Minervini, Matko Bosnjak, Tim Rocktäschel, Sebastian Riedel
Neural models combining representation learning and reasoning in an end-to-end trainable manner are receiving increasing interest.
1 code implementation • ICML 2018 • Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric Xing, Shimon Whiteson
Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives.
2 code implementations • 20 Jun 2018 • Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel
For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions.
5 code implementations • 14 Feb 2018 • Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson
Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives.
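For context, the core of DiCE is often summarised by its "MagicBox" operator; the sketch below shows a minimal PyTorch version applied to a toy trajectory, not the paper's full estimator.

```python
import torch

def magic_box(log_probs):
    # exp(x - stop_gradient(x)) evaluates to 1 in the forward pass, but under
    # (repeated) differentiation it reintroduces the score-function terms, so
    # higher-order gradient estimators stay correct.
    return torch.exp(log_probs - log_probs.detach())

# Toy usage: a DiCE-style surrogate objective for a single trajectory.
log_probs = torch.tensor([-0.3, -0.5], requires_grad=True)  # per-step log pi(a|s)
rewards = torch.tensor([1.0, 2.0])
# Each reward is weighted by the magic box of the log-probs that causally precede it.
objective = (magic_box(torch.cumsum(log_probs, dim=0)) * rewards).sum()
objective.backward()
print(log_probs.grad)
```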
no code implementations • 27 Dec 2017 • Tim Rocktäschel
The current state-of-the-art in many natural language processing and automated knowledge base completion tasks is held by representation learning methods which learn distributed vector representations of symbols via gradient-based optimization.
1 code implementation • ICLR 2018 • Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson
To address these challenges, we propose TreeQN, a differentiable, recursive, tree-structured model that serves as a drop-in replacement for any value function network in deep RL with discrete actions.
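A very rough PyTorch sketch of the recursive tree expansion and max-backup such an architecture performs; the module names, sizes, and depth handling are illustrative assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn

class TinyTreeQN(nn.Module):
    def __init__(self, state_dim, num_actions, depth=2, gamma=0.99):
        super().__init__()
        self.transition = nn.Linear(state_dim + num_actions, state_dim)
        self.reward = nn.Linear(state_dim + num_actions, 1)
        self.value = nn.Linear(state_dim, 1)
        self.num_actions, self.depth, self.gamma = num_actions, depth, gamma

    def q_values(self, state, depth=None):
        depth = self.depth if depth is None else depth
        qs = []
        for a in range(self.num_actions):
            one_hot = torch.zeros(self.num_actions)
            one_hot[a] = 1.0
            sa = torch.cat([state, one_hot])
            # Learned one-step transition and reward in abstract state space.
            next_state = torch.tanh(self.transition(sa))
            r = self.reward(sa)
            if depth == 1:
                backup = self.value(next_state)
            else:
                # Recursively expand the tree and back up with a max over actions.
                backup = self.q_values(next_state, depth - 1).max()
            qs.append(r + self.gamma * backup)
        return torch.stack(qs).squeeze(-1)
```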
1 code implementation • 24 Jul 2017 • Pasquale Minervini, Thomas Demeester, Tim Rocktäschel, Sebastian Riedel
The training objective is defined as a minimax problem, where an adversary finds the most offending adversarial examples by maximising the inconsistency loss, and the model is trained by jointly minimising a supervised loss and the inconsistency loss on the adversarial examples.
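A schematic single training step matching that description, with `supervised_loss`, `inconsistency_loss`, and the candidate pool as hypothetical placeholders:

```python
import torch

def adversarial_regularisation_step(model, optimiser, batch, candidate_inputs,
                                    supervised_loss, inconsistency_loss, lam=1.0):
    # Adversary: pick the candidate that most violates the logical constraints.
    with torch.no_grad():
        violations = torch.stack([inconsistency_loss(model, c) for c in candidate_inputs])
    worst = candidate_inputs[int(violations.argmax())]

    # Model: jointly minimise the supervised loss and the inconsistency loss
    # measured on the most offending adversarial example.
    loss = supervised_loss(model, batch) + lam * inconsistency_loss(model, worst)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```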
3 code implementations • NeurIPS 2017 • Tim Rocktäschel, Sebastian Riedel
We introduce neural networks for end-to-end differentiable proving of queries to knowledge bases by operating on dense vector representations of symbols.
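A hedged sketch of the soft unification underlying differentiable proving: symbols are compared through a kernel over their dense embeddings, and a proof score aggregates per-step unification scores (here with a min); the rule, facts, and aggregation are simplified for illustration.

```python
import torch

def soft_unify(sym_a, sym_b, embeddings, mu=1.0):
    """Similarity in (0, 1] between two symbol embeddings (RBF kernel)."""
    diff = embeddings[sym_a] - embeddings[sym_b]
    return torch.exp(-diff.pow(2).sum() / (2 * mu ** 2))

embeddings = {s: torch.randn(16, requires_grad=True)
              for s in ["grandpaOf", "fatherOf", "parentOf", "abe", "homer", "bart"]}

# Score one proof path for grandpaOf(abe, bart) through a two-step rule:
score = torch.min(torch.stack([
    soft_unify("grandpaOf", "fatherOf", embeddings),   # soft-match the predicates
    soft_unify("fatherOf", "fatherOf", embeddings),    # fact: fatherOf(abe, homer)
    soft_unify("parentOf", "fatherOf", embeddings),    # fact: fatherOf(homer, bart)
]))
score.backward()  # gradients flow into the symbol embeddings end-to-end
```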
no code implementations • 15 Feb 2017 • Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
This vector is used both for predicting the next token as well as for the key and value of a differentiable memory of a token history.
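A minimal sketch of the setup described above, where one hidden vector serves as query, key, and value for a differentiable memory of past token representations; dimensions are illustrative.

```python
import torch
import torch.nn.functional as F

def attend_over_history(hidden, history):
    """hidden: (d,) current state; history: (t, d) past states used as keys and values."""
    scores = history @ hidden                      # dot-product attention scores
    weights = F.softmax(scores, dim=0)             # attention over the token history
    context = weights @ history                    # weighted sum of past values
    return torch.cat([hidden, context])            # features for the next-token softmax

hidden = torch.randn(32)
history = torch.randn(10, 32)
features = attend_over_history(hidden, history)    # shape (64,)
```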
5 code implementations • 24 Nov 2016 • Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
By augmenting a neural language model with a pointer network specialized in referring to predefined classes of identifiers, we obtain a much lower perplexity and a 5 percentage point increase in accuracy for code suggestion compared to an LSTM baseline.
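For intuition, a pointer-augmented language model mixes a vocabulary distribution with a copy distribution over previously seen identifiers via a learned switch; the sketch below is a simplified stand-in, not the paper's architecture.

```python
import torch
import torch.nn.functional as F

def mix_with_pointer(vocab_logits, pointer_scores, identifier_token_ids,
                     switch_logit, vocab_size):
    """identifier_token_ids: LongTensor of vocab ids, one per history identifier."""
    p_vocab = F.softmax(vocab_logits, dim=-1)                  # (vocab_size,)
    p_copy = F.softmax(pointer_scores, dim=-1)                 # (num_history_identifiers,)
    # Scatter the pointer probabilities onto the vocabulary positions of the
    # identifiers they refer to.
    p_pointer = torch.zeros(vocab_size).scatter_add(0, identifier_token_ids, p_copy)
    g = torch.sigmoid(switch_logit)                            # probability of copying
    return g * p_pointer + (1 - g) * p_vocab
```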
7 code implementations • WS 2016 • Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel
Many current natural language processing applications for social media rely on representation learning and utilize pre-trained word embeddings.
no code implementations • EMNLP 2016 • Thomas Demeester, Tim Rocktäschel, Sebastian Riedel
Methods based on representation learning currently hold the state-of-the-art in many natural language processing and knowledge base inference tasks.
1 code implementation • EMNLP 2016 • Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva
Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton as "positive", "negative", or "neutral".
no code implementations • 9 Jun 2016 • Dirk Weissenborn, Tim Rocktäschel
Recurrent neural networks such as the GRU and LSTM have found wide adoption in natural language processing and achieve state-of-the-art results for many tasks.
no code implementations • 4 Jun 2016 • Vladyslav Kolesnyk, Tim Rocktäschel, Sebastian Riedel
We take entailment-pairs of the Stanford Natural Language Inference corpus and train an LSTM with attention.
1 code implementation • ICML 2017 • Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel
Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model.
7 code implementations • 22 Sep 2015 • Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom
We extend this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases (see the sketch after this entry).
Ranked #84 on Natural Language Inference on SNLI
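A small sketch of the word-by-word attention referenced above: each hypothesis token attends over the premise, and the attended context is concatenated to the hypothesis representation; shapes are illustrative and the original model is LSTM-based.

```python
import torch
import torch.nn.functional as F

def word_by_word_attention(premise_states, hypothesis_states):
    """premise_states: (Lp, d), hypothesis_states: (Lh, d) encoded token states."""
    scores = hypothesis_states @ premise_states.T        # (Lh, Lp) alignment scores
    weights = F.softmax(scores, dim=-1)                  # attend over premise words
    attended = weights @ premise_states                  # (Lh, d) premise summaries
    # Concatenate each hypothesis state with its attended premise context.
    return torch.cat([hypothesis_states, attended], dim=-1)

premise = torch.randn(7, 64)     # e.g. an encoded premise sentence
hypothesis = torch.randn(5, 64)  # e.g. an encoded hypothesis sentence
features = word_by_word_attention(premise, hypothesis)   # (5, 128)
```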