TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Automated Theorem Proving	Metamath set.mm	Evariste	Pass@32	72.4	# 1
Automated Theorem Proving	miniF2F-curriculum	GPT-f	Pass@64	30.6	# 4
Automated Theorem Proving	miniF2F-curriculum	Evariste	Pass@64	32.1	# 3
Automated Theorem Proving	miniF2F-curriculum	Evariste-7d	Pass@64	42.5	# 1
Automated Theorem Proving	miniF2F-curriculum	Evariste-1d	Pass@64	33.6	# 2
Automated Theorem Proving	miniF2F-test	Evariste	Pass@64	41	# 1
Automated Theorem Proving	miniF2F-test	Evariste-7d	Pass@64	40.6	# 2
Automated Theorem Proving	miniF2F-test	Evariste-1d	Pass@64	38.9	# 3
Automated Theorem Proving	miniF2F-test	GPT-f	Pass@64	36.6	# 4
Automated Theorem Proving	miniF2F-valid	Evariste-7d	Pass@64	47.5	# 2
Automated Theorem Proving	miniF2F-valid	Evariste-1d	Pass@64	46.7	# 4
Automated Theorem Proving	miniF2F-valid	GPT-f	Pass@64	47.3	# 3
Automated Theorem Proving	miniF2F-valid	Evariste	Pass@64	58.6	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hypertree-proof-search-for-neural-theorem/automated-theorem-proving-on-metamath-setmm)](https://paperswithcode.com/sota/automated-theorem-proving-on-metamath-setmm?p=hypertree-proof-search-for-neural-theorem)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hypertree-proof-search-for-neural-theorem/automated-theorem-proving-on-minif2f-1)](https://paperswithcode.com/sota/automated-theorem-proving-on-minif2f-1?p=hypertree-proof-search-for-neural-theorem)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hypertree-proof-search-for-neural-theorem/automated-theorem-proving-on-minif2f-test)](https://paperswithcode.com/sota/automated-theorem-proving-on-minif2f-test?p=hypertree-proof-search-for-neural-theorem)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hypertree-proof-search-for-neural-theorem/automated-theorem-proving-on-minif2f-valid)](https://paperswithcode.com/sota/automated-theorem-proving-on-minif2f-valid?p=hypertree-proof-search-for-neural-theorem)`

HyperTree Proof Search for Neural Theorem Proving

23 May 2022 · Guillaume Lample, Marie-Anne Lachaux, Thibaut Lavril, Xavier Martinet, Amaury Hayat, Gabriel Ebner, Aurélien Rodriguez, Timothée Lacroix

We propose an online training procedure for a transformer-based automated theorem prover. Our approach leverages a new search algorithm, HyperTree Proof Search (HTPS), inspired by the recent success of AlphaZero. Our model learns from previous proof searches through online training, allowing it to generalize to domains far from the training distribution. We report detailed ablations of our pipeline's main components by studying performance on three environments of increasing complexity. In particular, we show that with HTPS alone, a model trained on annotated proofs manages to prove 65.4% of a held-out set of Metamath theorems, significantly outperforming the previous state of the art of 56.5% by GPT-f. Online training on these unproved theorems increases accuracy to 82.6%. With a similar computational budget, we improve the state of the art on the Lean-based miniF2F-curriculum dataset from 31% to 42% proving accuracy.
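
The results for this paper are reported as Pass@32 and Pass@64, which the page does not define. The sketch below assumes the common reading in theorem proving: a theorem counts as proved if at least one of k proof-search attempts succeeds, and the score is the percentage of theorems proved. The function name `pass_at_k`, the theorem labels, and the toy outcomes are hypothetical illustrations, not data or code from the paper.

```python
from typing import Dict, List


def pass_at_k(attempts: Dict[str, List[bool]], k: int) -> float:
    """Percentage of theorems with at least one success in their first k attempts.

    Assumption: each list holds the outcomes of independent proof-search
    attempts on one theorem (True means a valid proof was found).
    """
    if not attempts:
        raise ValueError("no theorems to evaluate")
    if k < 1:
        raise ValueError("k must be at least 1")
    solved = sum(1 for outcomes in attempts.values() if any(outcomes[:k]))
    return 100.0 * solved / len(attempts)


# Hypothetical outcomes for four theorems, four attempts each.
toy_results = {
    "thm_a": [False, True, False, False],
    "thm_b": [False, False, False, False],
    "thm_c": [True, False, False, False],
    "thm_d": [False, False, False, True],
}

print(pass_at_k(toy_results, k=1))  # 25.0
print(pass_at_k(toy_results, k=4))  # 75.0
```

Under this reading, a larger k can only keep or raise the score, so Pass@32 and Pass@64 figures are not directly comparable across datasets.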

Tasks

Automated Theorem Proving

Datasets

MiniF2F

Results from the Paper

Ranked #1 on Automated Theorem Proving on Metamath set.mm (Pass@32 metric)

Methods

AlphaZero
