Search Results for author: Ekdeep Singh Lubana

Found 28 papers, 14 papers with code

Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry

no code implementations • 3 Mar 2025 • Sai Sumedh R. Hindupur, Ekdeep Singh Lubana, Thomas Fel, Demba Ba

Sparse Autoencoders (SAEs) are widely used to interpret neural networks by identifying meaningful concepts from their representations.

Bilevel Optimization
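
For readers new to the framework: the generic SAE recipe interrogated in this line of work fits in a few lines of code. Below is a minimal, hypothetical PyTorch sketch (an overcomplete linear autoencoder with a ReLU encoder and an L1 sparsity penalty); all names, dimensions, and coefficients are illustrative and not taken from the paper.

```python
# Minimal, hypothetical sparse autoencoder (SAE) sketch; illustrative only,
# not the authors' implementation.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)  # activations -> concept codes
        self.decoder = nn.Linear(d_dict, d_model)  # dictionary of concept directions

    def forward(self, x):
        z = torch.relu(self.encoder(x))  # non-negative, ideally sparse codes
        return self.decoder(z), z

def sae_loss(x, x_hat, z, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty encouraging sparse codes.
    return ((x - x_hat) ** 2).mean() + l1_coeff * z.abs().sum(dim=-1).mean()

# Usage: fit the SAE on a batch of cached model activations.
sae = SparseAutoencoder(d_model=512, d_dict=4096)
x = torch.randn(64, 512)  # stand-in for real activations
x_hat, z = sae(x)
sae_loss(x, x_hat, z).backward()
```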

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

no code implementations • 18 Feb 2025 • Thomas Fel, Ekdeep Singh Lubana, Jacob S. Prince, Matthew Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba Ba, Talia Konkle

Sparse Autoencoders (SAEs) have emerged as a powerful framework for machine learning interpretability, enabling the unsupervised decomposition of model representations into a dictionary of abstract, human-interpretable concepts.

Dictionary Learning

ICLR: In-Context Learning of Representations

no code implementations • 29 Dec 2024 • Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka

Specifically, if we provide in-context exemplars wherein a concept plays a different role than what the pretraining data suggests, do models reorganize their representations in accordance with these novel semantics?

In-Context Learning • Large Language Model

Competition Dynamics Shape Algorithmic Phases of In-Context Learning

1 code implementation • 1 Dec 2024 • Core Francisco Park, Ekdeep Singh Lubana, Itamar Pres, Hidenori Tanaka

In-Context Learning (ICL) has significantly expanded the general-purpose nature of large language models, allowing them to adapt to novel tasks using only the context provided as input.

In-Context Learning

Abrupt Learning in Transformers: A Case Study on Matrix Completion

no code implementations • 29 Oct 2024 • Pulkit Gopalani, Ekdeep Singh Lubana, Wei Hu

We also analyze the training dynamics of individual model components to understand the sudden drop in loss.

Language Modeling • Language Modelling +2

Towards Reliable Evaluation of Behavior Steering Interventions in LLMs

no code implementations • 22 Oct 2024 • Itamar Pres, Laura Ruis, Ekdeep Singh Lubana, David Krueger

Representation engineering methods have recently shown promise for enabling efficient steering of model behavior.

Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing

no code implementations • 22 Oct 2024 • Kento Nishi, Maya Okawa, Rahul Ramesh, Mikail Khona, Hidenori Tanaka, Ekdeep Singh Lubana

We call this phenomenon representation shattering and demonstrate that it results in degradation of factual recall and reasoning performance more broadly.

knowledge editing • Mamba

Analyzing (In)Abilities of SAEs via Formal Languages

no code implementations • 15 Oct 2024 • Abhinav Menon, Manish Shrivastava, David Krueger, Ekdeep Singh Lubana

Autoencoders have been used for finding interpretable and disentangled features underlying neural network representations in both image and text domains.

Dynamics of Concept Learning and Compositional Generalization

no code implementations • 10 Oct 2024 • Yongyi Yang, Core Francisco Park, Ekdeep Singh Lubana, Maya Okawa, Wei Hu, Hidenori Tanaka

We mathematically analyze the learning dynamics of neural networks trained on this SIM task and show that, despite its simplicity, SIM's learning dynamics capture and help explain key empirical observations on compositional generalization with diffusion models identified in prior work.

A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language

1 code implementation • 22 Aug 2024 • Ekdeep Singh Lubana, Kyogo Kawaguchi, Robert P. Dick, Hidenori Tanaka

We empirically investigate this definition by proposing an experimental system grounded in a context-sensitive formal language and find that Transformers trained to perform tasks on top of strings from this language indeed exhibit emergent capabilities.

Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space

1 code implementation • 27 Jun 2024 • Core Francisco Park, Maya Okawa, Andrew Lee, Hidenori Tanaka, Ekdeep Singh Lubana

Modern generative models demonstrate impressive capabilities, likely stemming from an ability to identify and manipulate abstract concepts underlying their training data.

Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model

no code implementations • 12 Feb 2024 • Mikail Khona, Maya Okawa, Jan Hula, Rahul Ramesh, Kento Nishi, Robert Dick, Ekdeep Singh Lubana, Hidenori Tanaka

Stepwise inference protocols, such as scratchpads and chain-of-thought, help language models solve complex problems by decomposing them into a sequence of simpler subproblems.

Diversity

FoMo Rewards: Can we cast foundation models as reward functions?

no code implementations • 6 Dec 2023 • Ekdeep Singh Lubana, Johann Brehmer, Pim de Haan, Taco Cohen

We explore the viability of casting foundation models as generic reward functions for reinforcement learning.

Language Modeling • Language Modelling +1

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

no code implementations • 21 Nov 2023 • Samyak Jain, Robert Kirk, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Edward Grefenstette, Tim Rocktäschel, David Scott Krueger

Fine-tuning large pre-trained models has become the de facto strategy for developing both task-specific and general-purpose machine learning systems, including developing models that are safe to deploy.

Network Pruning

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

1 code implementation • 21 Nov 2023 • Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka

Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing basic arithmetic.

In-Context Learning Dynamics with Random Binary Sequences

1 code implementation • 26 Oct 2023 • Eric J. Bigelow, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka, Tomer D. Ullman

Large language models (LLMs) trained on huge text corpora demonstrate intriguing capabilities, achieving state-of-the-art performance on tasks they were not explicitly trained for.

In-Context Learning

Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task

1 code implementation • NeurIPS 2023 • Maya Okawa, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka

Motivated by this, we perform a controlled study for understanding compositional generalization in conditional diffusion models in a synthetic setting, varying different attributes of the training data and measuring the model's ability to generate samples out-of-distribution.

Mechanistic Mode Connectivity

1 code implementation • 15 Nov 2022 • Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David Krueger, Hidenori Tanaka

We study neural network loss landscapes through the lens of mode connectivity, the observation that minimizers of neural networks retrieved via training on a dataset are connected via simple paths of low loss.
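
The linear special case of mode connectivity is easy to probe directly: interpolate the weights of two independently trained minimizers and record the loss along the segment. The hypothetical PyTorch sketch below does exactly that; the paper's mechanistic analysis concerns richer, non-linear paths, so this illustrates the underlying observation rather than the paper's method.

```python
# Hypothetical probe of linear mode connectivity; assumes model_a and model_b
# share an architecture. Illustrative only.
import copy
import torch

@torch.no_grad()
def loss_along_linear_path(model_a, model_b, loss_fn, batch, n_points=11):
    probe = copy.deepcopy(model_a)
    probe.eval()
    x, y = batch
    losses = []
    for alpha in torch.linspace(0.0, 1.0, n_points):
        for p, pa, pb in zip(probe.parameters(),
                             model_a.parameters(),
                             model_b.parameters()):
            p.copy_((1 - alpha) * pa + alpha * pb)  # interpolate weights
        losses.append(loss_fn(probe(x), y).item())
    return losses  # a flat, low curve suggests the minima are linearly connected
```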

What shapes the loss landscape of self-supervised learning?

no code implementations • 2 Oct 2022 • Liu Ziyin, Ekdeep Singh Lubana, Masahito Ueda, Hidenori Tanaka

Prevention of complete and dimensional collapse of representations has recently become a design principle for self-supervised learning (SSL).

Self-Supervised Learning

Analyzing Data-Centric Properties for Graph Contrastive Learning

1 code implementation • 4 Aug 2022 • Puja Trivedi, Ekdeep Singh Lubana, Mark Heimann, Danai Koutra, Jayaraman J. Thiagarajan

Overall, our work rigorously contextualizes, both empirically and theoretically, the effects of data-centric properties on augmentation strategies and learning paradigms for graph SSL.

Contrastive Learning • Self-Supervised Learning +1

Augmentations in Graph Contrastive Learning: Current Methodological Flaws & Towards Better Practices

no code implementations • 5 Nov 2021 • Puja Trivedi, Ekdeep Singh Lubana, Yujun Yan, Yaoqing Yang, Danai Koutra

Unsupervised graph representation learning is critical to a wide range of applications where labels may be scarce or expensive to procure.

Contrastive Learning • Data Augmentation +5

How do Quadratic Regularizers Prevent Catastrophic Forgetting: The Role of Interpolation

2 code implementations • 4 Feb 2021 • Ekdeep Singh Lubana, Puja Trivedi, Danai Koutra, Robert P. Dick

Catastrophic forgetting undermines the effectiveness of deep neural networks (DNNs) in scenarios such as continual learning and lifelong learning.

Continual Learning
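
For concreteness, the quadratic-regularizer family analyzed here (EWC and relatives) penalizes movement away from a previous task's solution, weighted per parameter. The sketch below is a generic, hypothetical PyTorch version: theta_star and importance are assumed dictionaries of saved parameter snapshots and importance weights (e.g., a Fisher estimate), not the paper's exact formulation.

```python
# Hypothetical EWC-style quadratic penalty; illustrative, not the paper's method.
def quadratic_penalty(model, theta_star, importance, lam=1.0):
    """Computes lam * sum_i F_i * (theta_i - theta*_i)^2 over all parameters."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (importance[name] * (p - theta_star[name]) ** 2).sum()
    return lam * penalty

# New-task objective: loss = task_loss + quadratic_penalty(model, ...).
# The paper's analysis concerns how such penalties drive the new solution toward
# an interpolation between the old and new task minima.
```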

A Gradient Flow Framework For Analyzing Network Pruning

1 code implementation • ICLR 2021 • Ekdeep Singh Lubana, Robert P. Dick

We use this framework to determine the relationship between pruning measures and the evolution of model parameters, establishing several results about pruning models early in training: (i) magnitude-based pruning removes parameters that contribute least to reduction in loss, resulting in models that converge faster than magnitude-agnostic methods; (ii) loss-preservation-based pruning preserves first-order model evolution dynamics and is therefore appropriate for pruning minimally trained models; and (iii) gradient-norm-based pruning affects second-order model evolution dynamics, such that increasing gradient norm via pruning can produce poorly performing models.

Network Pruning
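
As an illustration of the simplest measure covered by result (i), the hypothetical sketch below performs global magnitude pruning: parameters with the smallest |w|, i.e., those the gradient-flow analysis identifies as contributing least to loss reduction, are masked out. The sparsity level and masking details are illustrative.

```python
# Hypothetical global magnitude pruning; illustrative, not the paper's code.
import torch

@torch.no_grad()
def magnitude_prune(model, sparsity=0.9):
    # Consider only weight matrices/filters; skip biases and norm parameters.
    prunable = {n: p for n, p in model.named_parameters() if p.dim() > 1}
    all_w = torch.cat([p.abs().flatten() for p in prunable.values()])
    threshold = torch.quantile(all_w, sparsity)  # global magnitude cutoff
    masks = {}
    for name, p in prunable.items():
        masks[name] = (p.abs() > threshold).float()
        p.mul_(masks[name])  # zero the smallest-magnitude weights
    return masks  # reapply after each optimizer step to keep pruned weights at zero
```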

OrthoReg: Robust Network Pruning Using Orthonormality Regularization

1 code implementation • 10 Sep 2020 • Ekdeep Singh Lubana, Puja Trivedi, Conrad Hougen, Robert P. Dick, Alfred O. Hero

To address this issue, we propose OrthoReg, a principled regularization strategy that enforces orthonormality on a network's filters to reduce inter-filter correlation, thereby allowing reliable, efficient determination of group importance estimates, improved trainability of pruned networks, and efficient, simultaneous pruning of large groups of filters.

Network Pruning
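
To make the core idea concrete: an orthonormality regularizer penalizes the deviation of each layer's filter Gram matrix from the identity, which discourages inter-filter correlation. The sketch below is a simplified, hypothetical penalty in this spirit; OrthoReg's actual formulation (layer selection, normalization, scheduling) is more involved.

```python
# Hypothetical orthonormality penalty in the spirit of OrthoReg; illustrative only.
import torch
import torch.nn as nn

def orthonormality_penalty(model, coeff=1e-2):
    penalty = 0.0
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            W = module.weight.flatten(start_dim=1)  # (num_filters, fan-in)
            gram = W @ W.t()
            eye = torch.eye(gram.shape[0], device=W.device)
            penalty = penalty + ((gram - eye) ** 2).sum()  # ||W W^T - I||_F^2
    return coeff * penalty  # add to the task loss during training
```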
