1 code implementation • 29 May 2025 • Daniel Wurgaft, Ben Prystawski, Kanishk Gandhi, Cedegao E. Zhang, Joshua B. Tenenbaum, Noah D. Goodman
The think-aloud method, where participants voice their thoughts as they solve a task, is a valuable source of rich data about human reasoning processes.
1 code implementation • 3 Mar 2025 • Kanishk Gandhi, Ayush Chakravarthy, Anikait Singh, Nathan Lile, Noah D. Goodman
In systematic experimentation with controlled behavioral datasets, we find that priming Llama with examples containing these reasoning behaviors enables substantial improvements during RL, matching or exceeding Qwen's performance.
1 code implementation • 24 Feb 2025 • Alon Albalak, Duy Phung, Nathan Lile, Rafael Rafailov, Kanishk Gandhi, Louis Castricato, Anikait Singh, Chase Blagden, Violet Xiang, Dakota Mahan, Nick Haber
However, existing open math datasets either contain a small collection of high-quality, human-written problems or a large corpus of machine-generated problems of uncertain quality, forcing researchers to choose between quality and quantity.
no code implementations • 8 Jan 2025 • Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn
We propose a novel framework, Meta Chain-of-Thought (Meta-CoT), which extends traditional Chain-of-Thought (CoT) by explicitly modeling the underlying reasoning required to arrive at a particular CoT.
1 code implementation • 2 Jan 2025 • Kanishk Gandhi, Michael Y. Li, Lyle Goodyear, Louise Li, Aditi Bhaskar, Mohammed Zaman, Noah D. Goodman
To quantitatively evaluate a scientific agent's ability to collect informative experimental data, we compute the expected information gain (EIG), an information-theoretic quantity which measures how much an experiment reduces uncertainty about the parameters of a generative model.
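The EIG described above is the mutual information between an experiment's outcome and the model parameters. As a hypothetical illustration (not the paper's actual evaluation code), the following sketch computes the EIG of a single coin flip under a discretized uniform prior over the coin's bias, using EIG = H(y) − E_θ[H(y | θ)]:

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def bernoulli_eig(prior, thetas):
    """EIG of observing one Bernoulli outcome, given a prior over theta."""
    # Marginal (prior predictive) probability of heads
    p_heads = np.sum(prior * thetas)
    marginal_entropy = entropy([p_heads, 1 - p_heads])
    # Expected conditional entropy of the outcome given theta
    cond_entropy = np.sum(
        prior * np.array([entropy([t, 1 - t]) for t in thetas])
    )
    return marginal_entropy - cond_entropy

thetas = np.linspace(0.01, 0.99, 99)      # discretized parameter grid
prior = np.ones_like(thetas) / len(thetas)  # uniform prior over the bias
print(f"EIG of one flip: {bernoulli_eig(prior, thetas):.4f} nats")
```

An informative experiment has high EIG; here the single flip yields roughly 0.19 nats, close to the theoretical value ln 2 − 1/2 for a uniform prior.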
no code implementations • 4 Dec 2024 • Alex Havrilla, Andrew Dai, Laura O'Mahony, Koen Oostermeijer, Vera Zisler, Alon Albalak, Fabrizio Milo, Sharath Chandra Raparthy, Kanishk Gandhi, Baber Abbasi, Duy Phung, Maia Iyer, Dakota Mahan, Chase Blagden, Srishti Gureja, Mohammed Hamdy, Wen-Ding Li, Giovanni Paolini, Pawan Sasanka Ammanamanchi, Elliot Meyerson
Further, we emphasize the existence of Quality-Diversity trade-offs in training data and the downstream effects on model performance.
no code implementations • 18 Sep 2024 • Kanishk Gandhi, Zoe Lynch, Jan-Philipp Fränken, Kayla Patterson, Sharon Wambu, Tobias Gerstenberg, Desmond C. Ong, Noah D. Goodman
Understanding emotions is fundamental to human interaction and experience.
1 code implementation • 22 Jul 2024 • Joy He-Yueya, Wanjing Anya Ma, Kanishk Gandhi, Benjamin W. Domingue, Emma Brunskill, Noah D. Goodman
We demonstrate that our metric can capture important variations in populations that traditional metrics, like differences in accuracy, fail to capture.
1 code implementation • 22 Apr 2024 • Jan-Philipp Fränken, Eric Zelikman, Rafael Rafailov, Kanishk Gandhi, Tobias Gerstenberg, Noah D. Goodman
On single-turn dialogue and summarization, a SAMI-trained mistral-7b outperforms the initial pretrained model, with win rates between 66% and 77%.
1 code implementation • 17 Apr 2024 • Jan-Philipp Fränken, Kanishk Gandhi, Tori Qiu, Ayesha Khawaja, Noah D. Goodman, Tobias Gerstenberg
We collected moral permissibility and intention judgments from human participants for a subset of our items and compared these judgments to those from two language models (GPT-4 and Claude-2) across eight conditions.
1 code implementation • 1 Apr 2024 • Kanishk Gandhi, Denise Lee, Gabriel Grand, Muxin Liu, Winson Cheng, Archit Sharma, Noah D. Goodman
In this paper, we show how language models can be taught to search by representing the process of search in language, as a flattened string -- a stream of search (SoS).
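In the spirit of the stream-of-search idea, a search procedure's intermediate steps can be serialized into a single flat string that a language model could be trained on. The sketch below is a hypothetical illustration (not the paper's format or code): it runs breadth-first search on a toy graph and emits each expansion and frontier push as a token in the trace.

```python
from collections import deque

def bfs_stream(graph, start, goal):
    """Run BFS and return (found, flattened trace string)."""
    frontier = deque([start])
    visited = {start}
    trace = []
    while frontier:
        node = frontier.popleft()
        trace.append(f"expand:{node}")  # record every expansion step
        if node == goal:
            trace.append("goal")
            return True, " ".join(trace)
        for nbr in graph.get(node, []):
            if nbr not in visited:
                visited.add(nbr)
                trace.append(f"push:{nbr}")  # record frontier additions
                frontier.append(nbr)
    trace.append("fail")
    return False, " ".join(trace)

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
found, stream = bfs_stream(graph, "A", "D")
print(stream)
# expand:A push:B push:C expand:B push:D expand:C expand:D goal
```

The resulting string makes mistakes and backtracking visible in the training data, rather than showing only the final solution path.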
1 code implementation • 26 Oct 2023 • Jan-Philipp Fränken, Sam Kwok, Peixuan Ye, Kanishk Gandhi, Dilip Arumugam, Jared Moore, Alex Tamkin, Tobias Gerstenberg, Noah D. Goodman
We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions.
no code implementations • 27 Sep 2023 • Yuan Wang, Surya T. Sathujoda, Krzysztof Sawicki, Kanishk Gandhi, Angelica I Aviles-Rivero, Pavlos G. Lagoudakis
A plethora of next-generation all-optical devices based on exciton-polaritons have been proposed in recent years, including prototypes of transistors, switches, analogue quantum simulators and others.
no code implementations • NeurIPS 2023 • Kanishk Gandhi, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman
Using our framework, we create a new social reasoning benchmark (BigToM) for LLMs which consists of 25 controls and 5,000 model-written evaluations.
1 code implementation • 6 Jun 2023 • Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman
In experiments on the PrOntoQA, ProofWriter and Syllogism Validity datasets, LogicGuide significantly improves the performance of GPT-3, GPT-3.5 Turbo and LLaMA (accuracy gains up to 35%), while drastically reducing content effects: the interference between unwanted prior assumptions and reasoning, which both humans and language models suffer from.
no code implementations • 30 May 2023 • Kanishk Gandhi, Dorsa Sadigh, Noah D. Goodman
Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining.
no code implementations • 14 Oct 2022 • Kanishk Gandhi, Siddharth Karamcheti, Madeline Liao, Dorsa Sadigh
Imitation learning from human-provided demonstrations is a strong approach for learning policies for robot manipulation.
no code implementations • NeurIPS 2021 • Kanishk Gandhi, Gala Stojnic, Brenden M. Lake, Moira R. Dillon
To achieve human-like common sense about everyday life, machine learning systems must understand and reason about the goals, preferences, and actions of other agents in the environment.
no code implementations • NeurIPS 2020 • Kanishk Gandhi, Brenden M. Lake
Strong inductive biases allow children to learn in fast and adaptable ways.