Search Results for author: Mikail Khona

Found 12 papers, 1 paper with code

Bridging Associative Memory and Probabilistic Modeling

no code implementations • 15 Feb 2024 • Rylan Schaeffer, Nika Zahedi, Mikail Khona, Dhruv Pai, Sang Truong, Yilun Du, Mitchell Ostrow, Sarthak Chandra, Andres Carranza, Ila Rani Fiete, Andrey Gromov, Sanmi Koyejo

Based on the observation that associative memory's energy functions can be seen as probabilistic modeling's negative log likelihoods, we build a bridge between the two that enables a useful flow of ideas in both directions.

In-Context Learning
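
As a rough sketch of the energy/likelihood correspondence mentioned in the abstract above (notation mine, not necessarily the paper's): interpreting an energy function as defining a Boltzmann distribution makes the energy an unnormalized negative log likelihood.

```latex
% Assumed correspondence (my reading of the abstract, not the paper's notation):
% an associative-memory energy E(x) induces a Boltzmann distribution, so
% lowering the energy of a pattern raises its (unnormalized) log likelihood.
p(x) = \frac{e^{-E(x)}}{Z}, \qquad Z = \int e^{-E(x')} \, dx'
\quad\Longleftrightarrow\quad
E(x) = -\log p(x) - \log Z .
```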

Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model

no code implementations • 12 Feb 2024 • Mikail Khona, Maya Okawa, Jan Hula, Rahul Ramesh, Kento Nishi, Robert Dick, Ekdeep Singh Lubana, Hidenori Tanaka

Stepwise inference protocols, such as scratchpads and chain-of-thought, help language models solve complex problems by decomposing them into a sequence of simpler subproblems.
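
As a hedged illustration of such a decomposition on a synthetic graph-navigation task (the toy graph, node names, and step format below are my own assumptions, not the paper's dataset):

```python
from collections import deque

# Toy directed graph (illustrative; not the paper's dataset).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}

def stepwise_path(graph, start, goal):
    """Decompose 'navigate from start to goal' into single-edge steps,
    mimicking a scratchpad / chain-of-thought trace (BFS shortest path)."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    # Reconstruct the path and emit it one step at a time.
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parents[node]
    path.reverse()
    for a, b in zip(path, path[1:]):
        print(f"step: {a} -> {b}")
    return path

stepwise_path(graph, "A", "E")  # prints: A -> B, B -> D, D -> E
```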

Disentangling Fact from Grid Cell Fiction in Trained Deep Path Integrators

no code implementations • 6 Dec 2023 • Rylan Schaeffer, Mikail Khona, Sanmi Koyejo, Ila Rani Fiete

Work on deep learning-based models of grid cells suggests that grid cells generically and robustly arise from optimizing networks to path integrate, i.e., track one's spatial position by integrating self-velocity signals.
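
Path integration itself is just accumulating self-velocity over time; here is a minimal numpy sketch of that computation (illustrative parameters, not the trained deep networks the paper examines):

```python
import numpy as np

# Minimal path-integration sketch: position is the running integral of
# self-velocity. With noisy velocity estimates, error accumulates over time,
# which is why error-correcting representations (e.g., grid codes) matter.
rng = np.random.default_rng(0)
dt = 0.1
velocity = rng.normal(0.0, 1.0, size=(1000, 2))          # true 2D self-velocity
noisy_velocity = velocity + rng.normal(0.0, 0.05, size=velocity.shape)

true_position = np.cumsum(velocity * dt, axis=0)
estimated_position = np.cumsum(noisy_velocity * dt, axis=0)

drift = np.linalg.norm(true_position - estimated_position, axis=1)
print(f"final drift after {len(drift)} steps: {drift[-1]:.3f}")
```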

Testing Assumptions Underlying a Unified Theory for the Origin of Grid Cells

no code implementations • 27 Nov 2023 • Rylan Schaeffer, Mikail Khona, Adrian Bertagnoli, Sanmi Koyejo, Ila Rani Fiete

At both the population and single-cell levels, we find evidence suggesting that neither of the assumptions is likely true in biological neural representations.

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

no code implementations • 21 Nov 2023 • Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka

Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing basic arithmetic.

Growing Brains: Co-emergence of Anatomical and Functional Modularity in Recurrent Neural Networks

no code implementations • 11 Oct 2023 • Ziming Liu, Mikail Khona, Ila R. Fiete, Max Tegmark

Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity, in which neurons can be clustered by activity similarity and participation in shared computational subtasks.

Clustering
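
To make "clustered by activity similarity" concrete, here is a sketch under my own assumptions (random placeholder activity, hierarchical clustering on activity correlations) rather than the paper's exact analysis:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Sketch: cluster hidden units of an RNN by the similarity of their activity
# traces. `activity` is (timesteps, n_units); here it is random placeholder
# data standing in for activity recorded while the RNN performs its tasks.
rng = np.random.default_rng(0)
activity = rng.normal(size=(500, 64))

corr = np.corrcoef(activity.T)               # (n_units, n_units) similarity
dist = 1.0 - corr                            # turn similarity into a distance
condensed = dist[np.triu_indices_from(dist, k=1)]
link = linkage(condensed, method="average")
labels = fcluster(link, t=4, criterion="maxclust")
print("cluster sizes:", np.bincount(labels)[1:])
```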

Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle

1 code implementation • 24 Mar 2023 • Rylan Schaeffer, Mikail Khona, Zachary Robertson, Akhilan Boopathy, Kateryna Pistunova, Jason W. Rocks, Ila Rani Fiete, Oluwasanmi Koyejo

Double descent is a surprising phenomenon in machine learning, in which as the number of model parameters grows relative to the number of data, test error drops as models grow ever larger into the highly overparameterized (data undersampled) regime.

Learning Theory • regression
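
A self-contained numpy sketch of double descent in linear regression with the minimum-norm least-squares fit; the data-generating details below are illustrative assumptions, not the paper's experiments. Test error typically peaks near the interpolation threshold (number of features ≈ number of training samples) and falls again as the model becomes heavily overparameterized.

```python
import numpy as np

# Double-descent sketch in linear regression: np.linalg.lstsq returns the
# minimum-norm solution when the problem is underdetermined (p > n_train).
rng = np.random.default_rng(0)
n_train, n_test, max_features = 30, 1000, 120
true_w = rng.normal(size=max_features) / np.sqrt(max_features)

X_train = rng.normal(size=(n_train, max_features))
X_test = rng.normal(size=(n_test, max_features))
y_train = X_train @ true_w + 0.1 * rng.normal(size=n_train)
y_test = X_test @ true_w

for p in [5, 15, 25, 30, 35, 60, 120]:        # number of features used
    w_hat, *_ = np.linalg.lstsq(X_train[:, :p], y_train, rcond=None)
    mse = np.mean((X_test[:, :p] @ w_hat - y_test) ** 2)
    print(f"{p:4d} features -> test MSE {mse:8.3f}")  # typically peaks near p ~ n_train
```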

See and Copy: Generation of complex compositional movements from modular and geometric RNN representations

no code implementations • 5 Oct 2022 • Sunny Duan, Mikail Khona, Adrian Bertagnoli, Sarthak Chandra, Ila Fiete

A hallmark of biological intelligence and control is combinatorial generalization: animals are able to learn various things, then piece them together in new combinations to produce appropriate outputs for new tasks.

Winning the lottery with neural connectivity constraints: faster learning across cognitive tasks with spatially constrained sparse RNNs

no code implementations • 7 Jul 2022 • Mikail Khona, Sarthak Chandra, Joy J. Ma, Ila Fiete

We study LM-RNNs in a multitask learning setting relevant to cognitive systems neuroscience with a commonly used set of tasks, 20-Cog-tasks [Yang et al., 2019].
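
One way to picture the "spatially constrained sparse" connectivity in the title is a distance-dependent mask over neurons laid out on a grid; the layout, radius, and mask rule below are illustrative assumptions, not the paper's LM-RNN construction:

```python
import numpy as np

# Sketch: place neurons on a 2D grid and keep only recurrent connections
# between neurons within a local radius, giving a sparse, spatially
# constrained recurrent weight matrix for an RNN.
rng = np.random.default_rng(0)
side = 16                                     # 16 x 16 = 256 units
coords = np.array([(i, j) for i in range(side) for j in range(side)], float)

dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
mask = (dists <= 3.0) & (dists > 0)           # local neighborhood, no self-loops

W = rng.normal(size=mask.shape) * mask / np.sqrt(mask.sum(1, keepdims=True))
print(f"connection density: {mask.mean():.3f}")
```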

Attractor and integrator networks in the brain

no code implementations • 7 Dec 2021 • Mikail Khona, Ila R. Fiete

In this review, we describe the singular success of attractor neural network models in describing how the brain maintains persistent activity states for working memory, error-corrects, and integrates noisy cues.
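
A toy example of the integrator idea the review covers: a single linear unit whose recurrent feedback cancels its leak, so activity persists after inputs are removed and tracks the accumulated cues (parameters illustrative, not from the review):

```python
import numpy as np

# Toy neural integrator: with recurrent gain tuned to ~1, the unit's activity
# persists once input stops (working memory) and is proportional to the
# running integral of the input cues it has received.
dt, tau = 0.01, 0.1
w_rec = 1.0                        # feedback tuned so the leak is cancelled
x = 0.0
inputs = [1.0] * 50 + [0.0] * 150 + [-0.5] * 50 + [0.0] * 150   # pulsed cues

trace = []
for u in inputs:
    # dx/dt = (-x + w_rec * x + u) / tau  ->  a pure integrator when w_rec = 1
    x += dt * (-x + w_rec * x + u) / tau
    trace.append(x)

print(f"activity after first pulse:  {trace[199]:.2f}")   # held at ~5.0
print(f"activity after second pulse: {trace[-1]:.2f}")    # held at ~2.5
```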
