Search Results for author: Gabriel Poesia

Found 16 papers, 11 papers with code

Formal Mathematical Reasoning: A New Frontier in AI

no code implementations • 20 Dec 2024 • Kaiyu Yang, Gabriel Poesia, Jingxuan He, Wenda Li, Kristin Lauter, Swarat Chaudhuri, Dawn Song

AI for Mathematics (AI4Math) is not only intriguing intellectually but also crucial for AI-driven discovery in science, engineering, and beyond.

Automated Theorem Proving Math +1

Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning

no code implementations • 19 Dec 2024 • Simon Frieder, Jonas Bayer, Katherine M. Collins, Julius Berner, Jacob Loader, András Juhász, Fabian Ruehle, Sean Welleck, Gabriel Poesia, Ryan-Rhys Griffiths, Adrian Weller, Anirudh Goyal, Thomas Lukasiewicz, Timothy Gowers

The suite of datasets commonly used to train and evaluate the mathematical capabilities of AI-based mathematical copilots (primarily large language models) exhibits several shortcomings.

Math

dafny-annotator: AI-Assisted Verification of Dafny Programs

no code implementations • 5 Nov 2024 • Gabriel Poesia, Chloe Loughridge, Nada Amin

Since this data-driven approach is hindered by the lack of large-scale training data, we propose a method for open-ended synthesis of new Dafny programs in a flexible pipeline where LLMs formulate high-level ideas, implement them, and incrementally propose changes to existing programs, which Dafny validates.

Friction

h4rm3l: A language for Composable Jailbreak Attack Synthesis

no code implementations • 9 Aug 2024 • Moussa Koulako Bala Doumbouya, Ananjan Nandi, Gabriel Poesia, Davide Ghilardi, Anna Goldie, Federico Bianchi, Dan Jurafsky, Christopher D. Manning

We demonstrate h4rm3l's efficacy by synthesizing a dataset of 2656 successful novel jailbreak attacks targeting 6 SOTA open-source and proprietary LLMs, and by benchmarking those models against a subset of these synthesized attacks.

Benchmarking Program Synthesis +1

MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula

1 code implementation • 1 Jul 2024 • Shubhra Mishra, Gabriel Poesia, Belinda Mo, Noah D. Goodman

Mathematical problem solving is an important skill for Large Language Models (LLMs), both as an important capability and a proxy for a range of reasoning abilities.

Mathematical Problem-Solving

Learning Formal Mathematics From Intrinsic Motivation

2 code implementations • 30 Jun 2024 • Gabriel Poesia, David Broman, Nick Haber, Noah D. Goodman

We propose novel methods for hindsight relabeling on proof search trees to significantly improve the agent's sample efficiency in both tasks.

Automated Theorem Proving Language Modeling +2

When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions

1 code implementation • 12 Jun 2024 • Zhening Li, Gabriel Poesia, Armando Solar-Lezama

Skills are temporal abstractions that are intended to improve reinforcement learning (RL) performance through hierarchical RL.

Reinforcement Learning (RL)

Hypothesis Search: Inductive Reasoning with Language Models

1 code implementation • 11 Sep 2023 • Ruocheng Wang, Eric Zelikman, Gabriel Poesia, Yewen Pu, Nick Haber, Noah D. Goodman

Inductive reasoning is a core problem-solving capacity: humans can identify underlying principles from a few examples, which robustly generalize to novel scenarios.

ARC In-Context Learning

Certified Deductive Reasoning with Language Models

1 code implementation • 6 Jun 2023 • Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman

In experiments on PrOntoQA, ProofWriter and Syllogism Validity datasets, LogicGuide significantly improves the performance of GPT-3, GPT-3.5 Turbo and LLaMA (accuracy gains up to 35%), while drastically reducing content effects: the interference between unwanted prior assumptions and reasoning, which humans and language models suffer from.

Logical Reasoning valid

Solving Math Word Problems by Combining Language Models With Symbolic Solvers

1 code implementation • 16 Apr 2023 • Joy He-Yueya, Gabriel Poesia, Rose E. Wang, Noah D. Goodman

Automatically generating high-quality step-by-step solutions to math word problems has many applications in education.

GSM8K Language Modeling +2

Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions

1 code implementation • 20 Dec 2022 • Eric Zelikman, Qian Huang, Gabriel Poesia, Noah D. Goodman, Nick Haber

Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs.

Automated Theorem Proving Code Generation +5

Peano: Learning Formal Mathematical Reasoning

1 code implementation • 29 Nov 2022 • Gabriel Poesia, Noah D. Goodman

We explore this idea in a case study on 5 sections of beginning algebra on the Khan Academy platform.

Automated Theorem Proving Mathematical Reasoning +1

Synchromesh: Reliable code generation from pre-trained language models

2 code implementations • ICLR 2022 • Gabriel Poesia, Oleksandr Polozov, Vu Le, Ashish Tiwari, Gustavo Soares, Christopher Meek, Sumit Gulwani

Then, Synchromesh feeds the examples to a pre-trained language model and samples programs using Constrained Semantic Decoding (CSD): a general framework for constraining the output to a set of valid programs in the target language.

Code Generation Language Modeling +2

Contrastive Reinforcement Learning of Symbolic Reasoning Domains

2 code implementations • NeurIPS 2021 • Gabriel Poesia, WenXin Dong, Noah Goodman

Our results suggest new directions for reinforcement learning in symbolic domains, as well as applications to mathematics education.

reinforcement-learning Reinforcement Learning +1
