Automated Theorem Proving

70 papers with code • 10 benchmarks • 8 datasets

The goal of Automated Theorem Proving is to automatically generate a proof, given a conjecture (the target theorem) and a knowledge base of known facts, all expressed in a formal language. Automated Theorem Proving is useful in a wide range of applications, including the verification and synthesis of software and hardware systems.

Source: Learning to Prove Theorems by Learning to Generate Theorems
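
As a rough illustration of the definition above, the following is a minimal sketch of a forward-chaining prover over propositional Horn rules. It is not taken from the cited paper. The function name `prove`, the `(premises, conclusion)` rule encoding, and the sample knowledge base are assumptions made for this example only.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical encoding: a rule is (premises, conclusion).
# A known fact is a rule with an empty set of premises.
Rule = Tuple[FrozenSet[str], str]


def prove(knowledge_base: List[Rule], conjecture: str) -> Optional[List[str]]:
    """Forward-chain over the knowledge base.

    Returns the trace of derivation steps up to and including the one
    that derives `conjecture`, or None if the conjecture is not derivable.
    The trace may contain steps that the conjecture does not depend on.
    """
    derived = set()          # atoms established so far
    steps: List[str] = []    # human-readable derivation trace
    changed = True
    while changed:
        changed = False
        for premises, conclusion in knowledge_base:
            # Fire a rule only if it adds something new and all premises hold.
            if conclusion in derived or not premises <= derived:
                continue
            derived.add(conclusion)
            if premises:
                steps.append(f"{', '.join(sorted(premises))} => {conclusion}")
            else:
                steps.append(f"{conclusion} (given)")
            if conclusion == conjecture:
                return steps
            changed = True
    return None


# Knowledge base: the fact p, plus the rules p -> q and q -> r.
kb: List[Rule] = [
    (frozenset(), "p"),
    (frozenset({"p"}), "q"),
    (frozenset({"q"}), "r"),
]

print(prove(kb, "r"))  # derivation trace for the conjecture r
print(prove(kb, "s"))  # None: s does not follow from the knowledge base
```

Practical provers work over far richer logics (first-order, higher-order, type theory) and use guided search rather than exhaustive saturation. The sketch only shows the input/output shape of the task: a conjecture and a knowledge base go in, and a proof or a failure comes out.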

Latest papers with no code

Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving

no code yet • 10 Apr 2024

Recent advances in Automated Theorem Proving have shown the effectiveness of leveraging a (large) language model that generates tactics (i.e., proof steps) to search through proof states.

Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry

no code yet • 9 Apr 2024

In this note, we revisit the IMO-AG-30 Challenge introduced with AlphaGeometry, and find that Wu's method is surprisingly strong.

Proceedings 12th International Workshop on Theorem proving components for Educational software

no code yet • 4 Apr 2024

The ThEdu series pursues the smooth transition from an intuitive way of doing mathematics at secondary school to a more formal approach to the subject in STEM education, while favouring software support for this transition by exploiting the power of theorem-proving technologies.

Multi-Task Learning with Multi-Task Optimization

no code yet • 24 Mar 2024

Multi-task learning solves multiple correlated tasks.

Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code

no code yet • 19 Mar 2024

In the realm of formal theorem proving, the Coq proof assistant stands out for its rigorous approach to verifying mathematical assertions and software correctness.

BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving

no code yet • 6 Mar 2024

We also provide a qualitative analysis, illustrating that improved performance is associated with more semantically-aware embeddings.

Learning Guided Automated Reasoning: A Brief Survey

no code yet • 6 Mar 2024

Automated theorem provers and formal proof assistants are general reasoning systems that are in theory capable of proving arbitrarily hard theorems, thus solving arbitrary problems reducible to mathematics and logical reasoning.

A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic

no code yet • 28 Feb 2024

In this paper, we introduce a novel framework for analyzing the complexity of a question answer based on the natural deduction calculus as presented in Prawitz (1965).

EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages

no code yet • 12 Feb 2024

This paper introduces EvoGPT-f: a novel evolutionary framework for the first systematic quantitative analysis of the differential machine learnability of five formal math corpora (Lean 3, Lean 4, Coq, HOL 4, HOL Light) using four tokenization methods (character, word-level, Byte Pair Encoding and StarCoder tokenizer).

"Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors

no code yet • 6 Feb 2024

Large-scale generative models are shown to be useful for sampling meaningful candidate solutions, yet they often overlook task constraints and user preferences.