Search Results for author: Mikail Khona

Found 12 papers, 1 paper with code

Bridging Associative Memory and Probabilistic Modeling

no code implementations • 15 Feb 2024 • Rylan Schaeffer, Nika Zahedi, Mikail Khona, Dhruv Pai, Sang Truong, Yilun Du, Mitchell Ostrow, Sarthak Chandra, Andres Carranza, Ila Rani Fiete, Andrey Gromov, Sanmi Koyejo

Based on the observation that associative memory's energy functions can be seen as probabilistic modeling's negative log likelihoods, we build a bridge between the two that enables a useful flow of ideas in both directions.

In-Context Learning
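
As a rough sketch of the energy/likelihood correspondence mentioned in the abstract above (notation mine, not necessarily the paper's): interpreting an energy function as defining a Boltzmann distribution makes the energy an unnormalized negative log likelihood.

```latex
% Assumed correspondence (my reading of the abstract, not the paper's notation):
% an associative-memory energy E(x) induces a Boltzmann distribution, so
% lowering the energy of a pattern raises its (unnormalized) log likelihood.
p(x) = \frac{e^{-E(x)}}{Z}, \qquad Z = \int e^{-E(x')} \, dx'
\quad\Longleftrightarrow\quad
E(x) = -\log p(x) - \log Z .
```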

Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model

no code implementations • 12 Feb 2024 • Mikail Khona, Maya Okawa, Jan Hula, Rahul Ramesh, Kento Nishi, Robert Dick, Ekdeep Singh Lubana, Hidenori Tanaka

Stepwise inference protocols, such as scratchpads and chain-of-thought, help language models solve complex problems by decomposing them into a sequence of simpler subproblems.
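
As a hedged illustration of such a decomposition on a synthetic graph-navigation task (the toy graph, node names, and step format below are my own assumptions, not the paper's dataset):

```python
from collections import deque

# Toy directed graph (illustrative; not the paper's dataset).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}

def stepwise_path(graph, start, goal):
    """Decompose 'navigate from start to goal' into single-edge steps,
    mimicking a scratchpad / chain-of-thought trace (BFS shortest path)."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    # Reconstruct the path and emit it one step at a time.
    path, node = [], goal
    while node is not None:
        path.append(node)
        node = parents[node]
    path.reverse()
    for a, b in zip(path, path[1:]):
        print(f"step: {a} -> {b}")
    return path

stepwise_path(graph, "A", "E")  # prints: A -> B, B -> D, D -> E
```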

Disentangling Fact from Grid Cell Fiction in Trained Deep Path Integrators

no code implementations • 6 Dec 2023 • Rylan Schaeffer, Mikail Khona, Sanmi Koyejo, Ila Rani Fiete

Work on deep learning-based models of grid cells suggests that grid cells generically and robustly arise from optimizing networks to path integrate, i.e., track one's spatial position by integrating self-velocity signals.
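
Path integration itself is just accumulating self-velocity over time; here is a minimal numpy sketch of that computation (illustrative parameters, not the trained deep networks the paper examines):

```python
import numpy as np

# Minimal path-integration sketch: position is the running integral of
# self-velocity. With noisy velocity estimates, error accumulates over time,
# which is why error-correcting representations (e.g., grid codes) matter.
rng = np.random.default_rng(0)
dt = 0.1
velocity = rng.normal(0.0, 1.0, size=(1000, 2))          # true 2D self-velocity
noisy_velocity = velocity + rng.normal(0.0, 0.05, size=velocity.shape)

true_position = np.cumsum(velocity * dt, axis=0)
estimated_position = np.cumsum(noisy_velocity * dt, axis=0)

drift = np.linalg.norm(true_position - estimated_position, axis=1)
print(f"final drift after {len(drift)} steps: {drift[-1]:.3f}")
```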

Testing Assumptions Underlying a Unified Theory for the Origin of Grid Cells

no code implementations • 27 Nov 2023 • Rylan Schaeffer, Mikail Khona, Adrian Bertagnoli, Sanmi Koyejo, Ila Rani Fiete

At both the population and single-cell levels, we find evidence suggesting that neither of the assumptions is likely true in biological neural representations.

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

no code implementations • 21 Nov 2023 • Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona, Robert P. Dick, Hidenori Tanaka

Transformers trained on huge text corpora exhibit a remarkable set of capabilities, e.g., performing basic arithmetic.

Growing Brains: Co-emergence of Anatomical and Functional Modularity in Recurrent Neural Networks

no code implementations • 11 Oct 2023 • Ziming Liu, Mikail Khona, Ila R. Fiete, Max Tegmark

Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity, in which neurons can be clustered by activity similarity and participation in shared computational subtasks.

Clustering
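
To make "clustered by activity similarity" concrete, here is a sketch under my own assumptions (random placeholder activity, hierarchical clustering on activity correlations) rather than the paper's exact analysis:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Sketch: cluster hidden units of an RNN by the similarity of their activity
# traces. `activity` is (timesteps, n_units); here it is random placeholder
# data standing in for activity recorded while the RNN performs its tasks.
rng = np.random.default_rng(0)
activity = rng.normal(size=(500, 64))

corr = np.corrcoef(activity.T)               # (n_units, n_units) similarity
dist = 1.0 - corr                            # turn similarity into a distance
condensed = dist[np.triu_indices_from(dist, k=1)]
link = linkage(condensed, method="average")
labels = fcluster(link, t=4, criterion="maxclust")
print("cluster sizes:", np.bincount(labels)[1:])
```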

Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle

1 code implementation • 24 Mar 2023 • Rylan Schaeffer, Mikail Khona, Zachary Robertson, Akhilan Boopathy, Kateryna Pistunova, Jason W. Rocks, Ila Rani Fiete, Oluwasanmi Koyejo

Double descent is a surprising phenomenon in machine learning, in which as the number of model parameters grows relative to the number of data, test error drops as models grow ever larger into the highly overparameterized (data undersampled) regime.

Learning Theory • regression
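
A self-contained numpy sketch of double descent in linear regression with the minimum-norm least-squares fit; the data-generating details below are illustrative assumptions, not the paper's experiments. Test error typically peaks near the interpolation threshold (number of features ≈ number of training samples) and falls again as the model becomes heavily overparameterized.

```python
import numpy as np

# Double-descent sketch in linear regression: np.linalg.lstsq returns the
# minimum-norm solution when the problem is underdetermined (p > n_train).
rng = np.random.default_rng(0)
n_train, n_test, max_features = 30, 1000, 120
true_w = rng.normal(size=max_features) / np.sqrt(max_features)

X_train = rng.normal(size=(n_train, max_features))
X_test = rng.normal(size=(n_test, max_features))
y_train = X_train @ true_w + 0.1 * rng.normal(size=n_train)
y_test = X_test @ true_w

for p in [5, 15, 25, 30, 35, 60, 120]:        # number of features used
    w_hat, *_ = np.linalg.lstsq(X_train[:, :p], y_train, rcond=None)
    mse = np.mean((X_test[:, :p] @ w_hat - y_test) ** 2)
    print(f"{p:4d} features -> test MSE {mse:8.3f}")  # typically peaks near p ~ n_train
```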

See and Copy: Generation of complex compositional movements from modular and geometric RNN representations

no code implementations • 5 Oct 2022 • Sunny Duan, Mikail Khona, Adrian Bertagnoli, Sarthak Chandra, Ila Fiete

A hallmark of biological intelligence and control is combinatorial generalization: animals are able to learn various things, then piece them together in new combinations to produce appropriate outputs for new tasks.

Winning the lottery with neural connectivity constraints: faster learning across cognitive tasks with spatially constrained sparse RNNs

no code implementations • 7 Jul 2022 • Mikail Khona, Sarthak Chandra, Joy J. Ma, Ila Fiete

We study LM-RNNs in a multitask learning setting relevant to cognitive systems neuroscience with a commonly used set of tasks, 20-Cog-tasks [Yang et al., 2019].
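
One way to picture the "spatially constrained sparse" connectivity in the title is a distance-dependent mask over neurons laid out on a grid; the layout, radius, and mask rule below are illustrative assumptions, not the paper's LM-RNN construction:

```python
import numpy as np

# Sketch: place neurons on a 2D grid and keep only recurrent connections
# between neurons within a local radius, giving a sparse, spatially
# constrained recurrent weight matrix for an RNN.
rng = np.random.default_rng(0)
side = 16                                     # 16 x 16 = 256 units
coords = np.array([(i, j) for i in range(side) for j in range(side)], float)

dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
mask = (dists <= 3.0) & (dists > 0)           # local neighborhood, no self-loops

W = rng.normal(size=mask.shape) * mask / np.sqrt(mask.sum(1, keepdims=True))
print(f"connection density: {mask.mean():.3f}")
```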

Attractor and integrator networks in the brain

no code implementations • 7 Dec 2021 • Mikail Khona, Ila R. Fiete

In this review, we describe the singular success of attractor neural network models in describing how the brain maintains persistent activity states for working memory, error-corrects, and integrates noisy cues.
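
A toy example of the integrator idea the review covers: a single linear unit whose recurrent feedback cancels its leak, so activity persists after inputs are removed and tracks the accumulated cues (parameters illustrative, not from the review):

```python
import numpy as np

# Toy neural integrator: with recurrent gain tuned to ~1, the unit's activity
# persists once input stops (working memory) and is proportional to the
# running integral of the input cues it has received.
dt, tau = 0.01, 0.1
w_rec = 1.0                        # feedback tuned so the leak is cancelled
x = 0.0
inputs = [1.0] * 50 + [0.0] * 150 + [-0.5] * 50 + [0.0] * 150   # pulsed cues

trace = []
for u in inputs:
    # dx/dt = (-x + w_rec * x + u) / tau  ->  a pure integrator when w_rec = 1
    x += dt * (-x + w_rec * x + u) / tau
    trace.append(x)

print(f"activity after first pulse:  {trace[199]:.2f}")   # held at ~5.0
print(f"activity after second pulse: {trace[-1]:.2f}")    # held at ~2.5
```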
