Search Results for author: Aaditya K. Singh

Found 6 papers, 4 papers with code

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

3 code implementations • 10 Apr 2024 • Aaditya K. Singh, Ted Moskovitz, Felix Hill, Stephanie C. Y. Chan, Andrew M. Saxe

By clamping subsets of activations throughout training, we then identify three underlying subcircuits that interact to drive IH formation, yielding the phase change.

In-Context Learning
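
The "clamping" described in the abstract above can be illustrated with a small, hedged sketch: freezing one attention head's slice of activations to values cached from a reference forward pass via a PyTorch forward hook. The module path, cache, and per-head slicing below are assumptions made for illustration, not the authors' implementation.

```python
import torch

# Minimal sketch (assumption: a PyTorch transformer whose attention block we can
# hook and whose output is a [batch, seq, n_heads * head_dim] tensor). "Clamping"
# here means overwriting one head's slice of the output with values cached from a
# reference run, so that subcircuit's contribution is held fixed during training.

def make_clamp_hook(cached_acts: torch.Tensor, head_idx: int, head_dim: int):
    def hook(module, inputs, output):
        out = output.clone()
        s = slice(head_idx * head_dim, (head_idx + 1) * head_dim)
        # Replace this head's slice of the output with the cached activations.
        out[..., s] = cached_acts[..., s].to(out.device)
        return out  # returning a tensor from a forward hook replaces the output
    return hook

# Usage sketch: `model.blocks[0].attn` and `cached` are placeholders.
# handle = model.blocks[0].attn.register_forward_hook(
#     make_clamp_hook(cached, head_idx=2, head_dim=64))
# ... run training steps with the hook active ...
# handle.remove()
```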

Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs

1 code implementation • 22 Feb 2024 • Aaditya K. Singh, DJ Strouse

Tokenization, the division of input text into discrete tokens, is an often overlooked aspect of the large language model (LLM) pipeline and can be a source of useful or harmful inductive biases.

Inductive Bias, Language Modelling +1
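
As a hedged illustration of why tokenization matters for arithmetic, the toy functions below group a digit string into three-digit chunks from the left versus from the right, showing how the same number can receive different token boundaries. This mimics the kind of difference the paper studies; it is not the tokenizer of any particular LLM.

```python
# Illustrative only: compare left-to-right vs. right-to-left chunking of digits.

def chunk_left_to_right(digits: str, size: int = 3) -> list[str]:
    return [digits[i:i + size] for i in range(0, len(digits), size)]

def chunk_right_to_left(digits: str, size: int = 3) -> list[str]:
    rev = digits[::-1]
    return [c[::-1] for c in chunk_left_to_right(rev, size)][::-1]

print(chunk_left_to_right("1234567"))   # ['123', '456', '7']
print(chunk_right_to_left("1234567"))   # ['1', '234', '567']
```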

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data

no code implementations • 5 Dec 2023 • Yu Yang, Aaditya K. Singh, Mostafa Elhoushi, Anas Mahmoud, Kushal Tirumala, Fabian Gloeckle, Baptiste Rozière, Carole-Jean Wu, Ari S. Morcos, Newsha Ardalani

Armed with this knowledge, we devise novel pruning metrics that operate in embedding space to identify and remove low-quality entries in the Stack dataset.

Code Generation
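
A minimal sketch of what embedding-space pruning can look like, under the assumption that each entry is scored by cosine similarity to its cluster centroid and the lowest-scoring fraction is dropped. The scoring rule and threshold are illustrative choices, not the paper's exact metrics.

```python
import numpy as np

# Hedged sketch of embedding-space pruning: score each code-file embedding by its
# cosine similarity to the centroid of its (precomputed) cluster, then drop the
# lowest-scoring fraction of the dataset.

def prune_by_centroid_similarity(embeddings: np.ndarray,
                                 cluster_ids: np.ndarray,
                                 drop_frac: float = 0.2) -> np.ndarray:
    """Return indices of entries to keep."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = np.empty(len(emb))
    for c in np.unique(cluster_ids):
        mask = cluster_ids == c
        centroid = emb[mask].mean(axis=0)
        centroid /= np.linalg.norm(centroid)
        scores[mask] = emb[mask] @ centroid
    cutoff = np.quantile(scores, drop_frac)
    return np.where(scores >= cutoff)[0]

# Usage sketch with random data standing in for real code embeddings:
# keep = prune_by_centroid_similarity(np.random.randn(1000, 768),
#                                     np.random.randint(0, 10, size=1000))
```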

The Transient Nature of Emergent In-Context Learning in Transformers

2 code implementations • NeurIPS 2023 • Aaditya K. Singh, Stephanie C. Y. Chan, Ted Moskovitz, Erin Grant, Andrew M. Saxe, Felix Hill

The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models.

Bayesian Inference, In-Context Learning +1

Confronting Reward Model Overoptimization with Constrained RLHF

1 code implementation • 6 Oct 2023 • Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen Mcaleer

Large language models are typically aligned with human preferences by optimizing reward models (RMs) fitted to human feedback.
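
A toy sketch of the constrained-optimization idea behind such approaches: a primary reward is combined with auxiliary reward models through Lagrange multipliers that are raised (via dual ascent) whenever an auxiliary reward falls below its threshold. The update rule, thresholds, and values below are illustrative assumptions, not the paper's algorithm.

```python
# Toy sketch of a constrained-RLHF-style objective with learned multipliers.

def combined_reward(primary: float, aux: list[float],
                    lambdas: list[float], thresholds: list[float]) -> float:
    return primary + sum(l * (a - t) for l, a, t in zip(lambdas, aux, thresholds))

def update_lambdas(lambdas: list[float], aux: list[float],
                   thresholds: list[float], lr: float = 0.01) -> list[float]:
    # Dual ascent: a multiplier grows while its constraint (aux >= threshold)
    # is violated, and is projected back to zero otherwise.
    return [max(0.0, l - lr * (a - t))
            for l, a, t in zip(lambdas, aux, thresholds)]

lams = [0.0, 0.0]
for step in range(3):
    aux_rewards = [0.4, 0.9]          # stand-in per-batch measurements
    lams = update_lambdas(lams, aux_rewards, thresholds=[0.5, 0.5])
    print(step, lams, combined_reward(1.0, aux_rewards, lams, [0.5, 0.5]))
```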

Know your audience: specializing grounded language models with listener subtraction

no code implementations • 16 Jun 2022 • Aaditya K. Singh, David Ding, Andrew Saxe, Felix Hill, Andrew K. Lampinen

Through controlled experiments, we show that training a speaker with our method against two listeners that perceive differently allows the speaker to adapt to the idiosyncrasies of the listeners.

Language Modelling, Large Language Model
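
A hedged sketch of a listener-subtraction-style reward: the speaker is rewarded when the target listener resolves its message and penalized when a second, differently-perceiving listener also does, encouraging it to exploit what only the target listener perceives. The listener interface and weighting below are illustrative assumptions, not the paper's exact objective.

```python
from typing import Callable

# A "listener" maps a message to the probability of picking the correct referent.
Listener = Callable[[str], float]

def listener_subtraction_reward(message: str,
                                target: Listener,
                                other: Listener,
                                alpha: float = 1.0) -> float:
    # Reward success with the target listener, subtract success with the other.
    return target(message) - alpha * other(message)

# Toy usage: two stand-in listeners that key on different words in the message.
target = lambda m: 1.0 if "blue" in m else 0.0
other = lambda m: 1.0 if "left" in m else 0.0
print(listener_subtraction_reward("the blue one on the left", target, other))  # 0.0
print(listener_subtraction_reward("the blue one", target, other))              # 1.0
```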
