Search Results for author: Edward Grefenstette

Found 68 papers, 36 papers with code

Leading the Pack: N-player Opponent Shaping

no code implementations 19 Dec 2023 Alexandra Souly, Timon Willi, Akbir Khan, Robert Kirk, Chris Lu, Edward Grefenstette, Tim Rocktäschel

We evaluate on over 4 different environments, varying the number of players from 3 to 5, and demonstrate that model-based OS methods converge to equilibrium with better global welfare than naive learning.

Scaling Opponent Shaping to High Dimensional Games

no code implementations 19 Dec 2023 Akbir Khan, Timon Willi, Newton Kwan, Andrea Tacchetti, Chris Lu, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

In multi-agent settings with mixed incentives, methods developed for zero-sum games have been shown to lead to detrimental outcomes.

Meta-Learning

H-GAP: Humanoid Control with a Generalist Planner

no code implementations 5 Dec 2023 Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian

However, the extensive collection of human motion-captured data and the derived datasets of humanoid trajectories, such as MoCapAct, pave the way to tackle these challenges.

Humanoid Control Model Predictive Control +1

minimax: Efficient Baselines for Autocurricula in JAX

1 code implementation 21 Nov 2023 Minqi Jiang, Michael Dennis, Edward Grefenstette, Tim Rocktäschel

This compute requirement is a major obstacle to rapid innovation for the field.

Decision Making

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

no code implementations 21 Nov 2023 Samyak Jain, Robert Kirk, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Edward Grefenstette, Tim Rocktäschel, David Scott Krueger

Fine-tuning large pre-trained models has become the de facto strategy for developing both task-specific and general-purpose machine learning systems, including developing models that are safe to deploy.

Network Pruning

Understanding the Effects of RLHF on LLM Generalisation and Diversity

1 code implementation 10 Oct 2023 Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu

OOD generalisation is crucial given the wide range of real-world scenarios in which these models are being used, while output diversity refers to the model's ability to generate varied outputs and is important for a variety of use cases.

Instruction Following

Optimal Transport for Offline Imitation Learning

1 code implementation 24 Mar 2023 Yicheng Luo, Zhengyao Jiang, Samuel Cohen, Edward Grefenstette, Marc Peter Deisenroth

In this paper, we introduce Optimal Transport Reward labeling (OTR), an algorithm that assigns rewards to offline trajectories using a few high-quality demonstrations.

D4RL Imitation Learning +2

General Intelligence Requires Rethinking Exploration

no code implementations 15 Nov 2022 Minqi Jiang, Tim Rocktäschel, Edward Grefenstette

We are at the cusp of a transition from "learning from data" to "learning what data to learn from" as a central focus of artificial intelligence (AI) research.

reinforcement-learning Reinforcement Learning (RL)

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs

1 code implementation NeurIPS 2023 Laura Ruis, Akbir Khan, Stella Biderman, Sara Hooker, Tim Rocktäschel, Edward Grefenstette

We present our findings as the starting point for further research into evaluating how LLMs interpret language in context and to drive the development of more pragmatic and useful models of human discourse.

Efficient Planning in a Compact Latent Action Space

1 code implementation 22 Aug 2022 Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian

Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.

Continuous Control Decision Making +1

Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning

2 code implementations 23 Jul 2022 Michael Matthews, Mikayel Samvelyan, Jack Parker-Holder, Edward Grefenstette, Tim Rocktäschel

In this paper, we investigate how skills can be incorporated into the training of reinforcement learning (RL) agents in complex environments with large state-action spaces and sparse rewards.

Inductive Bias NetHack +2

Grounding Aleatoric Uncertainty for Unsupervised Environment Design

1 code implementation 11 Jul 2022 Minqi Jiang, Michael Dennis, Jack Parker-Holder, Andrei Lupu, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution.

Reinforcement Learning (RL)

Graph Backup: Data Efficient Backup Exploiting Markovian Transitions

1 code implementation 31 May 2022 Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette

In this paper, we treat the transition data of the MDP as a graph, and define a novel backup operator, Graph Backup, which exploits this graph structure for better value estimation.

Atari Games counterfactual +2

Evolving Curricula with Regret-Based Environment Design

3 code implementations 2 Mar 2022 Jack Parker-Holder, Minqi Jiang, Michael Dennis, Mikayel Samvelyan, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel

Our approach, which we call Adversarially Compounding Complexity by Editing Levels (ACCEL), seeks to constantly produce levels at the frontier of an agent's capabilities, resulting in curricula that start simple but become increasingly complex.

Reinforcement Learning (RL)

Replay-Guided Adversarial Environment Design

4 code implementations NeurIPS 2021 Minqi Jiang, Michael Dennis, Jack Parker-Holder, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel

Furthermore, our theory suggests a highly counterintuitive improvement to PLR: by stopping the agent from updating its policy on uncurated levels (training on less data), we can improve the convergence to Nash equilibria.

Reinforcement Learning (RL)

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

1 code implementation 27 Sep 2021 Mikayel Samvelyan, Robert Kirk, Vitaly Kurin, Jack Parker-Holder, Minqi Jiang, Eric Hambro, Fabio Petroni, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel

By leveraging the full set of entities and environment dynamics from NetHack, one of the richest grid-based video games, MiniHack allows designing custom RL testbeds that are fast and convenient to use.

NetHack reinforcement-learning +2

Return Dispersion as an Estimator of Learning Potential for Prioritized Level Replay

no code implementations NeurIPS Workshop ICBINB 2021 Iryna Korshunova, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel, Edward Grefenstette

Prioritized Level Replay (PLR) has been shown to induce adaptive curricula that improve the sample-efficiency and generalization of reinforcement learning policies in environments featuring multiple tasks or levels.

reinforcement-learning Reinforcement Learning (RL)

Prioritized Level Replay

4 code implementations 8 Oct 2020 Minqi Jiang, Edward Grefenstette, Tim Rocktäschel

Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning.

Systematic Generalization

Learning Reasoning Strategies in End-to-End Differentiable Proving

2 code implementations ICML 2020 Pasquale Minervini, Sebastian Riedel, Pontus Stenetorp, Edward Grefenstette, Tim Rocktäschel

Attempts to render deep learning models interpretable, data-efficient, and robust have seen some success through hybridisation with rule-based systems, for example, in Neural Theorem Provers (NTPs).

Link Prediction Relational Reasoning

The NetHack Learning Environment

3 code implementations NeurIPS 2020 Heinrich Küttler, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel

Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging environment for RL research based on the popular single-player terminal-based roguelike game, NetHack.

NetHack Score Reinforcement Learning (RL) +1

RTFM: Generalising to New Environment Dynamics via Reading

no code implementations ICLR 2020 Victor Zhong, Tim Rocktäschel, Edward Grefenstette

In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.

Differentiable Reasoning on Large Knowledge Bases and Natural Language

3 code implementations 17 Dec 2019 Pasquale Minervini, Matko Bošnjak, Tim Rocktäschel, Sebastian Riedel, Edward Grefenstette

Reasoning with knowledge expressed in natural language and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering.

Link Prediction Question Answering +1

RTFM: Generalising to Novel Environment Dynamics via Reading

1 code implementation 18 Oct 2019 Victor Zhong, Tim Rocktäschel, Edward Grefenstette

In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.

Generalized Inner Loop Meta-Learning

3 code implementations 3 Oct 2019 Edward Grefenstette, Brandon Amos, Denis Yarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, Douwe Kiela, Kyunghyun Cho, Soumith Chintala

Many (but not all) approaches self-qualifying as "meta-learning" in deep learning and reinforcement learning fit a common pattern of approximating the solution to a nested optimization problem.

Meta-Learning reinforcement-learning +1

Meta Learning via Learned Loss

no code implementations 25 Sep 2019 Sarah Bechtle, Artem Molchanov, Yevgen Chebotar, Edward Grefenstette, Ludovic Righetti, Gaurav Sukhatme, Franziska Meier

We present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures.

Meta-Learning reinforcement-learning +1

Meta-Learning via Learned Loss

1 code implementation 12 Jun 2019 Sarah Bechtle, Artem Molchanov, Yevgen Chebotar, Edward Grefenstette, Ludovic Righetti, Gaurav Sukhatme, Franziska Meier

This information shapes the learned loss function such that the environment does not need to provide this information during meta-test time.

Meta-Learning

A Survey of Reinforcement Learning Informed by Natural Language

no code implementations 10 Jun 2019 Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel

To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the compositional, relational, and hierarchical structure of the world, and learn to transfer it to the task at hand.

Decision Making Instruction Following +5

Scalable Neural Theorem Proving on Knowledge Bases and Natural Language

no code implementations ICLR 2019 Pasquale Minervini, Matko Bosnjak, Tim Rocktäschel, Edward Grefenstette, Sebastian Riedel

Reasoning over text and Knowledge Bases (KBs) is a major challenge for Artificial Intelligence, with applications in machine reading, dialogue, and question answering.

Automated Theorem Proving Link Prediction +2

Analysing Mathematical Reasoning Abilities of Neural Models

7 code implementations ICLR 2019 David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli

The structured nature of the mathematics domain, covering arithmetic, algebra, probability and calculus, enables the construction of training and test splits designed to clearly illuminate the capabilities and failure-modes of different architectures, as well as evaluate their ability to compose and relate knowledge and learned processes.

Math Word Problem Solving

CompILE: Compositional Imitation Learning and Execution

3 code implementations 4 Dec 2018 Thomas Kipf, Yujia Li, Hanjun Dai, Vinicius Zambaldi, Alvaro Sanchez-Gonzalez, Edward Grefenstette, Pushmeet Kohli, Peter Battaglia

We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data.

Continuous Control Imitation Learning

Strength in Numbers: Trading-off Robustness and Computation via Adversarially-Trained Ensembles

no code implementations ICLR 2019 Edward Grefenstette, Robert Stanforth, Brendan O'Donoghue, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli

We show that increasing the number of parameters in adversarially-trained models increases their robustness, and in particular that ensembling smaller models while adversarially training the entire ensemble as a single model is a more efficient way of spending said budget than simply using a larger single model.

Self-Driving Cars

Learning to Understand Goal Specifications by Modelling Reward

1 code implementation ICLR 2019 Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Arian Hosseini, Pushmeet Kohli, Edward Grefenstette

Recent work has shown that deep reinforcement-learning agents can learn to follow language-like instructions from infrequent environment rewards.

Can Neural Networks Understand Logical Entailment?

no code implementations ICLR 2018 Richard Evans, David Saxton, David Amos, Pushmeet Kohli, Edward Grefenstette

We introduce a new dataset of logical entailments for the purpose of measuring models' ability to capture and exploit the structure of logical expressions against an entailment prediction task.

Inductive Bias

The NarrativeQA Reading Comprehension Challenge

2 code implementations TACL 2018 Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette

Reading comprehension (RC), in contrast to information retrieval, requires integrating information and reasoning about events, entities, and their relations across a full document.

Ranked #9 on Question Answering on NarrativeQA (BLEU-1 metric)

Information Retrieval Question Answering +2

Learning Explanatory Rules from Noisy Data

3 code implementations 13 Nov 2017 Richard Evans, Edward Grefenstette

Artificial Neural Networks are powerful function approximators capable of modelling solutions to a wide variety of problems, both supervised and unsupervised.

Inductive logic programming

Deep Learning for Semantic Composition

no code implementations ACL 2017 Xiaodan Zhu, Edward Grefenstette

Learning representations to model the meaning of text has been a core problem in NLP.

Semantic Composition

Learning to Compose Words into Sentences with Reinforcement Learning

no code implementations 28 Nov 2016 Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling

We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences.

reinforcement-learning Reinforcement Learning (RL)

The Neural Noisy Channel

no code implementations 8 Nov 2016 Lei Yu, Phil Blunsom, Chris Dyer, Edward Grefenstette, Tomas Kocisky

We formulate sequence to sequence transduction as a noisy channel decoding problem and use recurrent neural networks to parameterise the source and channel models.

Machine Translation Morphological Inflection +2

Reasoning about Entailment with Neural Attention

7 code implementations 22 Sep 2015 Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom

We extend this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases.

Natural Language Inference

Learning to Transduce with Unbounded Memory

4 code implementations NeurIPS 2015 Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, Phil Blunsom

Recently, strong results have been demonstrated by Deep Recurrent Neural Networks on natural language transduction problems.

Natural Language Transduction Translation

Investigating the Role of Prior Disambiguation in Deep-learning Compositional Models of Meaning

no code implementations 15 Nov 2014 Jianpeng Cheng, Dimitri Kartsaklis, Edward Grefenstette

This paper aims to explore the effect of prior disambiguation on neural network-based compositional models, with the hope that better semantic representations for text compounds can be produced.

A Deep Architecture for Semantic Parsing

no code implementations WS 2014 Edward Grefenstette, Phil Blunsom, Nando de Freitas, Karl Moritz Hermann

Many successful approaches to semantic parsing build on top of the syntactic analysis of text, and make use of distributional representations or statistical models to match parses to ontology-specific queries.

Semantic Parsing

Category-Theoretic Quantitative Compositional Distributional Models of Natural Language Semantics

no code implementations 6 Nov 2013 Edward Grefenstette

This thesis focuses on a particular approach to this compositionality problem, namely using the categorical framework developed by Coecke, Sadrzadeh, and Clark, which combines syntactic analysis formalisms with distributional semantic representations of meaning to produce syntactically motivated composition operations.

"Not not bad" is not "bad": A distributional account of negation

no code implementations 10 Jun 2013 Karl Moritz Hermann, Edward Grefenstette, Phil Blunsom

With the increasing empirical success of distributional models of compositional semantics, it is timely to consider the types of textual logic that such models are capable of capturing.

Negation

A quantum teleportation inspired algorithm produces sentence meaning from word meaning and grammatical structure

no code implementations 2 May 2013 Stephen Clark, Bob Coecke, Edward Grefenstette, Stephen Pulman, Mehrnoosh Sadrzadeh

We discuss an algorithm which produces the meaning of a sentence given meanings of its words, and its resemblance to quantum teleportation.

Sentence

Towards a Formal Distributional Semantics: Simulating Logical Calculi with Tensors

no code implementations SEMEVAL 2013 Edward Grefenstette

This paper seeks to bring this reconciliation one step further by showing how the mathematical constructs commonly used in compositional distributional models, such as tensors and matrices, can be used to simulate different aspects of predicate logic.

Relation

Experimental Support for a Categorical Compositional Distributional Model of Meaning

1 code implementation 20 Jun 2011 Edward Grefenstette, Mehrnoosh Sadrzadeh

The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences.
