no code implementations • TACL 2014 • Felix Hill, Roi Reichart, Anna Korhonen
Multi-modal models that learn semantic representations from both linguistic and perceptual input outperform language-only models on a range of evaluations, and better reflect human concept acquisition.
3 code implementations • CL 2015 • Felix Hill, Roi Reichart, Anna Korhonen
We present SimLex-999, a gold standard resource for evaluating distributional semantic models that improves on existing resources in several important ways.
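To illustrate how a gold standard like SimLex-999 is used in practice, here is a minimal sketch of the standard evaluation protocol: score each word pair with a model (cosine similarity of embeddings here) and report the Spearman correlation with the human ratings. The toy vectors, word pairs, and ratings below are invented for illustration; they are not taken from SimLex-999 or from the authors' released code.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def ranks(xs):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy embeddings and (invented) human similarity ratings on a 0-10 scale.
embeddings = {
    "cup":  [0.9, 0.1, 0.2],
    "mug":  [0.8, 0.2, 0.1],
    "dog":  [0.1, 0.9, 0.3],
    "wolf": [0.2, 0.8, 0.4],
}
pairs = [("cup", "mug"), ("cup", "dog"), ("dog", "wolf")]
human = [8.5, 1.2, 7.0]

model = [cosine(embeddings[a], embeddings[b]) for a, b in pairs]
score = spearman(model, human)  # rank agreement between model and humans
```

The headline number reported for a model on SimLex-999 is exactly this kind of rank correlation, which is what makes the dataset a comparable yardstick across very different model families.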
no code implementations • 2 Oct 2014 • Felix Hill, Kyunghyun Cho, Sebastien Jean, Coline Devin, Yoshua Bengio
Neural language models learn word representations that capture rich linguistic and conceptual information.
no code implementations • 19 Dec 2014 • Felix Hill, Kyunghyun Cho, Sebastien Jean, Coline Devin, Yoshua Bengio
Here we investigate the embeddings learned by neural machine translation models, a recently developed class of neural language models.

2 code implementations • TACL 2016 • Felix Hill, Kyunghyun Cho, Anna Korhonen, Yoshua Bengio
Distributional models that learn rich semantic word representations are a success story of recent NLP research.
3 code implementations • 7 Nov 2015 • Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston
We introduce a new test of how well language models capture meaning in children's books.
1 code implementation • NAACL 2016 • Felix Hill, Kyunghyun Cho, Anna Korhonen
Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data.
Ranked #16 on Subjectivity Analysis on SUBJ
1 code implementation • EMNLP 2016 • Daniela Gerz, Ivan Vulić, Felix Hill, Roi Reichart, Anna Korhonen
Verbs play a critical role in the meaning of sentences, but these ubiquitous words have received little attention in recent distributional semantics research.
no code implementations • CL 2017 • Ivan Vulić, Daniela Gerz, Douwe Kiela, Felix Hill, Anna Korhonen
We introduce HyperLex, a dataset and evaluation resource that quantifies the extent of semantic category membership, i.e., the type-of relation (also known as hyponymy-hypernymy or lexical entailment, LE) between 2,616 concept pairs.
1 code implementation • 20 Jun 2017 • Karl Moritz Hermann, Felix Hill, Simon Green, Fumin Wang, Ryan Faulkner, Hubert Soyer, David Szepesvari, Wojciech Marian Czarnecki, Max Jaderberg, Denis Teplyashin, Marcus Wainwright, Chris Apps, Demis Hassabis, Phil Blunsom
Trained via a combination of reinforcement and unsupervised learning, and beginning with minimal prior knowledge, the agent learns to relate linguistic symbols to emergent perceptual representations of its physical surroundings and to pertinent sequences of actions.
no code implementations • ICLR 2018 • Felix Hill, Stephen Clark, Karl Moritz Hermann, Phil Blunsom
Neural network-based systems can now learn to locate the referents of words and phrases in images, answer questions about visual scenes, and execute symbolic instructions as first-person actors in partially-observable worlds.
no code implementations • ICLR 2018 • Felix Hill, Karl Moritz Hermann, Phil Blunsom, Stephen Clark
Neural network-based systems can now learn to locate the referents of words and phrases in images, answer questions about visual scenes, and even execute symbolic instructions as first-person actors in partially-observable worlds.
11 code implementations • WS 2018 • Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset.
Ranked #46 on Natural Language Inference on MultiNLI
1 code implementation • ICLR 2019 • Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Arian Hosseini, Pushmeet Kohli, Edward Grefenstette
Recent work has shown that deep reinforcement-learning agents can learn to follow language-like instructions from infrequent environment rewards.
2 code implementations • ICML 2018 • David G. T. Barrett, Felix Hill, Adam Santoro, Ari S. Morcos, Timothy Lillicrap
To succeed at this challenge, models must cope with various generalisation 'regimes' in which the training and test data differ in clearly-defined ways.
21 code implementations • NeurIPS 2018 • Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom
Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training.
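The core mechanism proposed in that paper, the Neural Arithmetic Logic Unit (NALU), can be sketched briefly: a weight matrix constrained toward {-1, 0, 1} is applied once in linear space (supporting addition and subtraction) and once in log space (supporting multiplication and division), with a learned sigmoid gate mixing the two paths. The single-output toy below uses hand-set rather than learned parameters and is an illustrative sketch under those assumptions, not the paper's implementation.

```python
import math

def nalu_unit(x, w_hat, m_hat, g_w, eps=1e-7):
    """One NALU cell with a scalar output.

    w_hat, m_hat parameterise weights constrained to (-1, 1) and biased
    toward {-1, 0, 1}; g_w parameterises the gate between the two paths.
    """
    sigmoid = lambda z: 1 / (1 + math.exp(-z))
    # Constrained weight: tanh(w_hat) * sigmoid(m_hat), elementwise.
    w = [math.tanh(wh) * sigmoid(mh) for wh, mh in zip(w_hat, m_hat)]
    # Additive path: plain linear combination.
    a = sum(wi * xi for wi, xi in zip(w, x))
    # Multiplicative path: the same weights applied in log space.
    m = math.exp(sum(wi * math.log(abs(xi) + eps) for wi, xi in zip(w, x)))
    # Learned gate interpolates between the two paths.
    g = sigmoid(sum(gw * xi for gw, xi in zip(g_w, x)))
    return g * a + (1 - g) * m

# With saturated parameters (w close to 1), the same cell computes either
# a sum or a product of its inputs, depending only on the gate.
add_out = nalu_unit([3.0, 4.0], [10, 10], [10, 10], [10, 10])    # near 3 + 4
mul_out = nalu_unit([3.0, 4.0], [10, 10], [10, 10], [-10, -10])  # near 3 * 4
```

Because the multiplicative path operates on the log of the inputs, extrapolation to magnitudes unseen in training is much better behaved than with an unconstrained MLP, which is the paper's central claim.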
no code implementations • 3 Dec 2018 • Aishwarya Agrawal, Mateusz Malinowski, Felix Hill, Ali Eslami, Oriol Vinyals, Tejas Kulkarni
In this work, we study the setting in which an agent must learn to generate programs for diverse scenes conditioned on a given symbolic instruction.
2 code implementations • ICLR 2019 • Felix Hill, Adam Santoro, David G. T. Barrett, Ari S. Morcos, Timothy Lillicrap
Here, we study how analogical reasoning can be induced in neural networks that learn to perceive and reason about raw visual data.
7 code implementations • ICLR 2019 • David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli
The structured nature of the mathematics domain, covering arithmetic, algebra, probability and calculus, enables the construction of training and test splits designed to clearly illuminate the capabilities and failure-modes of different architectures, as well as evaluate their ability to compose and relate knowledge and learned processes.
Ranked #2 on Question Answering on Mathematics Dataset
no code implementations • 18 Apr 2019 • Adam Santoro, Felix Hill, David Barrett, David Raposo, Matthew Botvinick, Timothy Lillicrap
Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does.
6 code implementations • NeurIPS 2019 • Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks.
no code implementations • IJCNLP 2019 • Mostafa Abdou, Artur Kulmizev, Felix Hill, Daniel M. Low, Anders Søgaard
Representational Similarity Analysis (RSA) is a technique developed by neuroscientists for comparing activity patterns of different measurement modalities (e.g., fMRI, electrophysiology, behavior).
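The RSA procedure itself can be sketched in a few lines: build a representational dissimilarity matrix (RDM) for each system over the same set of stimuli, then correlate the upper triangles of the two RDMs. The toy below uses 1 - cosine as the dissimilarity and Pearson correlation for the second-order comparison; real RSA pipelines typically offer other distances and rank-based correlations, so treat this as an illustrative sketch rather than the paper's method.

```python
from math import sqrt

def rdm(reps):
    """Representational dissimilarity matrix: 1 - cosine over stimuli."""
    def cos(u, v):
        d = sum(a * b for a, b in zip(u, v))
        return d / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))
    n = len(reps)
    return [[1 - cos(reps[i], reps[j]) for j in range(n)] for i in range(n)]

def upper_triangle(m):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rsa_score(reps_a, reps_b):
    """Second-order similarity: correlate the two systems' RDMs."""
    return pearson(upper_triangle(rdm(reps_a)), upper_triangle(rdm(reps_b)))

# Two systems with the same representational geometry up to per-vector
# scaling: cosine dissimilarity ignores the scaling, so RSA scores 1.
sys_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sys_b = [[2.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
score = rsa_score(sys_a, sys_b)
```

The appeal of RSA, and the reason it transfers naturally to comparing neural-network layers with brain recordings or with each other, is exactly this second-order design: the two systems never need to share a coordinate space, only a stimulus set.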
no code implementations • 25 Sep 2019 • Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
We address this issue by integrating language encoders that are pretrained on large text corpora into a situated, instruction-following agent.
no code implementations • ICLR 2020 • Felix Hill, Andrew Lampinen, Rosalia Schneider, Stephen Clark, Matthew Botvinick, James L. McClelland, Adam Santoro
The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI.
no code implementations • 12 Dec 2019 • James L. McClelland, Felix Hill, Maja Rudolph, Jason Baldridge, Hinrich Schütze
We take language to be a part of a system for understanding and communicating about situations.
no code implementations • 19 May 2020 • Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
Here, we propose a conceptually simple method for training instruction-following agents with deep RL that are robust to natural human instructions.
no code implementations • ICML 2020 • Abhishek Das, Federico Carnevale, Hamza Merzic, Laura Rimell, Rosalia Schneider, Josh Abramson, Alden Hung, Arun Ahuja, Stephen Clark, Gregory Wayne, Felix Hill
Recent work has shown how predictive modeling can endow agents with rich knowledge of their surroundings, improving their ability to act in complex environments.
2 code implementations • ICLR 2021 • Felix Hill, Olivier Tieleman, Tamara von Glehn, Nathaniel Wong, Hamza Merzic, Stephen Clark
Recent work has shown that large text-based neural language models, trained with conventional supervised learning objectives, acquire a surprising propensity for few- and one-shot learning.
no code implementations • 10 Dec 2020 • Josh Abramson, Arun Ahuja, Iain Barr, Arthur Brussee, Federico Carnevale, Mary Cassin, Rachita Chhaparia, Stephen Clark, Bogdan Damoc, Andrew Dudzik, Petko Georgiev, Aurelia Guy, Tim Harley, Felix Hill, Alden Hung, Zachary Kenton, Jessica Landon, Timothy Lillicrap, Kory Mathewson, Soňa Mokrá, Alistair Muldal, Adam Santoro, Nikolay Savinov, Vikrant Varma, Greg Wayne, Duncan Williams, Nathaniel Wong, Chen Yan, Rui Zhu
These evaluations convincingly demonstrate that interactive training and auxiliary losses improve agent behaviour beyond what is achieved by supervised learning of actions alone.
1 code implementation • NeurIPS 2021 • David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick
Neural networks have achieved success in a wide array of perceptual tasks but often fail at tasks involving both perception and higher-level reasoning.
Ranked #4 on Video Object Tracking on CATER
no code implementations • 1 Jan 2021 • David Ding, Felix Hill, Adam Santoro, Matthew Botvinick
Transformer-based language models have proved capable of rudimentary symbolic reasoning, underlining the effectiveness of applying self-attention computations to sets of discrete entities.
3 code implementations • NeurIPS 2021 • Andrew Kyle Lampinen, Stephanie C. Y. Chan, Andrea Banino, Felix Hill
Agents with common memory architectures struggle to recall and integrate across multiple timesteps of a past event, or even to recall the details of a single timestep that is followed by distractor tasks.
no code implementations • NeurIPS 2021 • Maria Tsimpoukelli, Jacob Menick, Serkan Cabi, S. M. Ali Eslami, Oriol Vinyals, Felix Hill
When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples.
Ranked #11 on Visual Question Answering (VQA) on VQA v2 val
no code implementations • 29 Sep 2021 • Wilka Torrico Carvalho, Andrew Kyle Lampinen, Kyriacos Nikiforou, Felix Hill, Murray Shanahan
Taking inspiration from cognitive science, we use the term "perceptual schemas" for representations of recurring segments of an agent's experience.
no code implementations • 29 Sep 2021 • Andrew Kyle Lampinen, Nicholas Andrew Roy, Ishita Dasgupta, Stephanie C.Y. Chan, Allison Tam, Chen Yan, Adam Santoro, Neil Charles Rabinowitz, Jane X Wang, Felix Hill
Explanations play a considerable role in human learning, especially in areas that remain major challenges for AI—forming abstractions, and learning about the relational and causal structure of the world.
2 code implementations • 18 Oct 2021 • Thomas Scialom, Felix Hill
There is currently no simple, unified way to compare, analyse or evaluate metrics across a representative set of tasks.
1 code implementation • 7 Dec 2021 • Andrew K. Lampinen, Nicholas A. Roy, Ishita Dasgupta, Stephanie C. Y. Chan, Allison C. Tam, James L. McClelland, Chen Yan, Adam Santoro, Neil C. Rabinowitz, Jane X. Wang, Felix Hill
Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents.
no code implementations • 7 Dec 2021 • DeepMind Interactive Agents Team, Josh Abramson, Arun Ahuja, Arthur Brussee, Federico Carnevale, Mary Cassin, Felix Fischer, Petko Georgiev, Alex Goldin, Mansi Gupta, Tim Harley, Felix Hill, Peter C Humphreys, Alden Hung, Jessica Landon, Timothy Lillicrap, Hamza Merzic, Alistair Muldal, Adam Santoro, Guy Scully, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu
A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language.
1 code implementation • 15 Dec 2021 • Wilka Carvalho, Andrew Lampinen, Kyriacos Nikiforou, Felix Hill, Murray Shanahan
Many important tasks are defined in terms of objects.
1 code implementation • 15 Mar 2022 • Stephanie C. Y. Chan, Andrew K. Lampinen, Pierre H. Richemond, Felix Hill
As humans and animals learn in the natural world, they encounter distributions of entities, situations and events that are far from uniform.
no code implementations • 5 Apr 2022 • Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Mathewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill
In summary, explanations can support the in-context learning of large LMs on challenging tasks.
no code implementations • 8 Apr 2022 • Allison C. Tam, Neil C. Rabinowitz, Andrew K. Lampinen, Nicholas A. Roy, Stephanie C. Y. Chan, DJ Strouse, Jane X. Wang, Andrea Banino, Felix Hill
We show that these pretrained representations drive meaningful, task-relevant exploration and improve performance on 3D simulated environments.
4 code implementations • 22 Apr 2022 • Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill
In further experiments, we found that naturalistic data distributions were only able to elicit in-context learning in transformers, and not in recurrent models.
no code implementations • 16 Jun 2022 • Aaditya K. Singh, David Ding, Andrew Saxe, Felix Hill, Andrew K. Lampinen
Through controlled experiments, we show that training a speaker with two listeners that perceive differently, using our method, allows the speaker to adapt to the idiosyncrasies of the listeners.
1 code implementation • 14 Jul 2022 • Ishita Dasgupta, Andrew K. Lampinen, Stephanie C. Y. Chan, Hannah R. Sheahan, Antonia Creswell, Dharshan Kumaran, James L. McClelland, Felix Hill
We evaluate state-of-the-art large language models, as well as humans, and find that the language models reflect many of the same patterns observed in humans across these tasks: like humans, models answer more accurately when the semantic content of a task supports the logical inferences.
no code implementations • 5 Aug 2022 • Steven T. Piantadosi, Felix Hill
The widespread success of large language models (LLMs) has been met with skepticism that they possess anything like human concepts or meanings.
no code implementations • 11 Oct 2022 • Stephanie C. Y. Chan, Ishita Dasgupta, Junkyung Kim, Dharshan Kumaran, Andrew K. Lampinen, Felix Hill
In transformers trained on controlled stimuli, we find that generalization from weights is more rule-based whereas generalization from context is largely exemplar-based.
2 code implementations • 12 Jan 2023 • Matko Bošnjak, Pierre H. Richemond, Nenad Tomasev, Florian Strub, Jacob C. Walker, Felix Hill, Lars Holger Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic
We propose a new semi-supervised learning method, Semantic Positives via Pseudo-Labels (SemPPL), that combines labelled and unlabelled data to learn informative representations.
no code implementations • 1 Feb 2023 • Ishita Dasgupta, Christine Kaeser-Chen, Kenneth Marino, Arun Ahuja, Sheila Babayan, Felix Hill, Rob Fergus
On the other hand, Large Scale Language Models (LSLMs) have exhibited strong reasoning ability and the ability to adapt to new tasks through in-context learning.
no code implementations • 9 Feb 2023 • Pierre H. Richemond, Allison Tam, Yunhao Tang, Florian Strub, Bilal Piot, Felix Hill
With simple linear algebra, we show that when using a linear predictor, the optimal predictor is close to an orthogonal projection, and we propose a general framework based on orthonormalization that enables us to interpret, and build intuition about, why BYOL works.
no code implementations • 13 Mar 2023 • Yuqing Du, Ksenia Konyushkova, Misha Denil, Akhil Raju, Jessica Landon, Felix Hill, Nando de Freitas, Serkan Cabi
Detecting successful behaviour is crucial for training intelligent agents.
2 code implementations • NeurIPS 2023 • Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill
The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models.
1 code implementation • 29 Nov 2023 • Drew A. Hudson, Daniel Zoran, Mateusz Malinowski, Andrew K. Lampinen, Andrew Jaegle, James L. McClelland, Loic Matthey, Felix Hill, Alexander Lerchner
We introduce SODA, a self-supervised diffusion model, designed for representation learning.
no code implementations • 13 Mar 2024 • SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, Zhitao Gong, Lucy Gonzales, Kshitij Gupta, Karol Gregor, Arne Olav Hallingstad, Tim Harley, Sam Haves, Felix Hill, Ed Hirst, Drew A. Hudson, Jony Hudson, Steph Hughes-Fitt, Danilo J. Rezende, Mimi Jasarevic, Laura Kampis, Rosemary Ke, Thomas Keck, Junkyung Kim, Oscar Knagg, Kavya Kopparapu, Andrew Lampinen, Shane Legg, Alexander Lerchner, Marjorie Limont, YuLan Liu, Maria Loks-Thompson, Joseph Marino, Kathryn Martin Cussons, Loic Matthey, Siobhan Mcloughlin, Piermaria Mendolicchio, Hamza Merzic, Anna Mitenkova, Alexandre Moufarek, Valeria Oliveira, Yanko Oliveira, Hannah Openshaw, Renke Pan, Aneesh Pappu, Alex Platonov, Ollie Purkiss, David Reichert, John Reid, Pierre Harvey Richemond, Tyson Roberts, Giles Ruscoe, Jaume Sanchez Elias, Tasha Sandars, Daniel P. Sawyer, Tim Scholtes, Guy Simmons, Daniel Slater, Hubert Soyer, Heiko Strathmann, Peter Stys, Allison C. Tam, Denis Teplyashin, Tayfun Terzi, Davide Vercelli, Bojan Vujatovic, Marcus Wainwright, Jane X. Wang, Zhengdong Wang, Daan Wierstra, Duncan Williams, Nathaniel Wong, Sarah York, Nick Young
Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI.
3 code implementations • 10 Apr 2024 • Aaditya K. Singh, Ted Moskovitz, Felix Hill, Stephanie C. Y. Chan, Andrew M. Saxe
By clamping subsets of activations throughout training, we then identify three underlying subcircuits that interact to drive IH formation, yielding the phase change.