Search Results for author: Rahul Ramesh

Found 13 papers, 6 papers with code

Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model

no code implementations • 12 Feb 2024 • Mikail Khona, Maya Okawa, Jan Hula, Rahul Ramesh, Kento Nishi, Robert Dick, Ekdeep Singh Lubana, Hidenori Tanaka

Stepwise inference protocols, such as scratchpads and chain-of-thought, help language models solve complex problems by decomposing them into a sequence of simpler subproblems.

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

no code implementations • 21 Nov 2023 • Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka

Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing basic arithmetic.

The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold

2 code implementations • 2 May 2023 • Jialin Mao, Itay Griniasty, Han Kheng Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, Pratik Chaudhari

We develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training.

Data Augmentation

A picture of the space of typical learnable tasks

2 code implementations • 31 Oct 2022 • Rahul Ramesh, Jialin Mao, Itay Griniasty, Rubing Yang, Han Kheng Teoh, Mark Transtrum, James P. Sethna, Pratik Chaudhari

We develop information geometric techniques to understand the representations learned by deep networks when they are trained on different tasks using supervised, meta-, semi-supervised and contrastive learning.

Contrastive Learning, Meta-Learning

The Value of Out-of-Distribution Data

1 code implementation • 23 Aug 2022 • Ashwin De Silva, Rahul Ramesh, Carey E. Priebe, Pratik Chaudhari, Joshua T. Vogelstein

In this work, we show a counter-intuitive phenomenon: the generalization error of a task can be a non-monotonic function of the number of OOD samples.

Data Augmentation, Hyperparameter Optimization

Deep Reference Priors: What is the best way to pretrain a model?

2 code implementations • Approximate Inference AABI Symposium 2022 • Yansong Gao, Rahul Ramesh, Pratik Chaudhari

Such priors enable the task to maximally affect the Bayesian posterior; e.g., reference priors depend upon the number of samples available for learning the task, and for very small sample sizes the prior puts more probability mass on low-complexity models in the hypothesis space.

Semi-Supervised Image Classification, Transfer Learning

Prospective Learning: Principled Extrapolation to the Future

no code implementations • 19 Jan 2022 • Ashwin De Silva, Rahul Ramesh, Lyle Ungar, Marshall Hussain Shuler, Noah J. Cowan, Michael Platt, Chen Li, Leyla Isik, Seung-Eon Roh, Adam Charles, Archana Venkataraman, Brian Caffo, Javier J. How, Justus M Kebschull, John W. Krakauer, Maxim Bichuch, Kaleab Alemayehu Kinfu, Eva Yezerets, Dinesh Jayaraman, Jong M. Shin, Soledad Villar, Ian Phillips, Carey E. Priebe, Thomas Hartung, Michael I. Miller, Jayanta Dey, Ningyuan Huang, Eric Eaton, Ralph Etienne-Cummings, Elizabeth L. Ogburn, Randal Burns, Onyema Osuagwu, Brett Mensh, Alysson R. Muotri, Julia Brown, Chris White, Weiwei Yang, Andrei A. Rusu, Timothy Verstynen, Konrad P. Kording, Pratik Chaudhari, Joshua T. Vogelstein

We conjecture that certain sequences of tasks are not retrospectively learnable (in which the data distribution is fixed), but are prospectively learnable (in which distributions may be dynamic), suggesting that prospective learning is more difficult in kind than retrospective learning.

Continual Learning, Decision Making

Model Zoo: A Growing Brain That Learns Continually

no code implementations • ICLR 2022 • Rahul Ramesh, Pratik Chaudhari

This paper argues that continual learning methods can benefit by splitting the capacity of the learner across multiple models.

Continual Learning, Learning Theory

Model Zoo: A Growing "Brain" That Learns Continually

2 code implementations • 6 Jun 2021 • Rahul Ramesh, Pratik Chaudhari

We use statistical learning theory and experimental analysis to show how multiple tasks can interact with each other in a non-trivial fashion when a single model is trained on them.

Ranked #1 on Continual Learning on Rotated MNIST