Search Results for author: Zhengying Liu

Found 27 papers, 16 papers with code

CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics

1 code implementation · 6 May 2025 · Junqi Liu, Xiaohan Lin, Jonas Bayer, Yael Dillies, Weijie Jiang, Xiaodan Liang, Roman Soletskyi, Haiming Wang, Yunzhou Xie, Beibei Xiong, Zhengfeng Yang, Jujian Zhang, Lihong Zhi, Jia Li, Zhengying Liu

CombiBench is suitable for testing IMO-solving capability, since it includes all IMO combinatorics problems since 2000 (except IMO 2004 P3, whose statement contains an image).

Benchmarking

FormalAlign: Automated Alignment Evaluation for Autoformalization

1 code implementation · 14 Oct 2024 · Jianqiao Lu, Yingjia Wan, Yinya Huang, Jing Xiong, Zhengying Liu, Zhijiang Guo

To address this, we introduce \textsc{FormalAlign}, the first automated framework designed for evaluating the alignment between natural and formal languages in autoformalization.

Mathematical Proofs · valid

ToolACE: Winning the Points of LLM Function Calling

no code implementations · 2 Sep 2024 · Weiwen Liu, Xu Huang, Xingshan Zeng, Xinlong Hao, Shuai Yu, Dexun Li, Shuai Wang, Weinan Gan, Zhengying Liu, Yuanqing Yu, Zezhong Wang, Yuxian Wang, Wu Ning, Yutai Hou, Bin Wang, Chuhan Wu, Xinzhi Wang, Yong liu, Yasheng Wang, Duyu Tang, Dandan Tu, Lifeng Shang, Xin Jiang, Ruiming Tang, Defu Lian, Qun Liu, Enhong Chen

Function calling significantly extends the application boundary of large language models, where high-quality and diverse training data is critical for unlocking this capability.

Process-Driven Autoformalization in Lean 4

2 code implementations · 4 Jun 2024 · Jianqiao Lu, Yingjia Wan, Zhengying Liu, Yinya Huang, Jing Xiong, Chengwu Liu, Jianhao Shen, Hui Jin, Jipeng Zhang, Haiming Wang, Zhicheng Yang, Jing Tang, Zhijiang Guo

Autoformalization, the conversion of natural language mathematics into formal languages, offers significant potential for advancing mathematical reasoning.

Mathematical Reasoning

Proving Theorems Recursively

1 code implementation · 23 May 2024 · Haiming Wang, Huajian Xin, Zhengying Liu, Wenda Li, Yinya Huang, Jianqiao Lu, Zhicheng Yang, Jing Tang, Jian Yin, Zhenguo Li, Xiaodan Liang

This approach allows the theorem to be tackled incrementally by outlining the overall theorem at the first level and then solving the intermediate conjectures at deeper levels.

Automated Theorem Proving
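The recursive outline-then-solve strategy summarized above can be sketched as a small search procedure. Everything here is illustrative: the decomposition table, the goal names, and the `close_directly` stub stand in for the paper's language-model prover.

```python
# Toy sketch of recursive theorem proving: outline a goal into intermediate
# conjectures, then prove each conjecture the same way until subgoals are
# small enough to close directly. All names below are hypothetical.

OUTLINES = {
    # hypothetical decomposition: goal -> intermediate conjectures
    "main_theorem": ["lemma_a", "lemma_b"],
    "lemma_a": ["fact_1"],
}

def close_directly(goal):
    """Stub for a base-level proof step (e.g. a single tactic call)."""
    return goal in {"lemma_b", "fact_1"}

def prove(goal, depth=0):
    """Prove `goal` by first trying a direct step, otherwise outlining it
    and recursively proving each intermediate conjecture."""
    print("  " * depth + goal)          # show the proof tree being explored
    if close_directly(goal):
        return True
    subgoals = OUTLINES.get(goal)
    if subgoals is None:                # no outline available: give up
        return False
    return all(prove(g, depth + 1) for g in subgoals)

print(prove("main_theorem"))  # → True
```

The first-level outline is attempted before any deeper conjecture is touched, mirroring the incremental top-down flow described in the abstract.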

ATG: Benchmarking Automated Theorem Generation for Generative Language Models

no code implementations · 5 May 2024 · Xiaohan Lin, Qingxing Cao, Yinya Huang, Zhicheng Yang, Zhengying Liu, Zhenguo Li, Xiaodan Liang

We conduct extensive experiments to investigate whether current LMs can generate theorems in the library and thereby aid in proving the problem theorems.

Automated Theorem Proving · Benchmarking

MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data

1 code implementation · 14 Feb 2024 · Yinya Huang, Xiaohan Lin, Zhengying Liu, Qingxing Cao, Huajian Xin, Haiming Wang, Zhenguo Li, Linqi Song, Xiaodan Liang

Recent large language models (LLMs) have witnessed significant advancement in various tasks, including mathematical reasoning and theorem proving.

Automated Theorem Proving · Language Modelling +3

Large Language Models as Automated Aligners for benchmarking Vision-Language Models

no code implementations · 24 Nov 2023 · Yuanfeng Ji, Chongjian Ge, Weikai Kong, Enze Xie, Zhengying Liu, Zhengguo Li, Ping Luo

In this work, we address the limitations via Auto-Bench, which delves into exploring LLMs as proficient aligners, measuring the alignment between VLMs and human intelligence and value through automatic data curation and assessment.

Benchmarking · World Knowledge

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis

no code implementations · 16 Oct 2023 · Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing Hong, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-yan Yeung, Lifeng Shang, Xin Jiang, Qun Liu

The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges.

Instruction Following

LEGO-Prover: Neural Theorem Proving with Growing Libraries

1 code implementation · 1 Oct 2023 · Haiming Wang, Huajian Xin, Chuanyang Zheng, Lin Li, Zhengying Liu, Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie, Jian Yin, Zhenguo Li, Heng Liao, Xiaodan Liang

Our ablation study indicates that these newly added skills are indeed helpful for proving theorems, resulting in an improvement from a success rate of 47.1% to 50.4%.

Ranked #1 on Automated Theorem Proving on miniF2F-valid (Pass@100 metric)

Automated Theorem Proving

Lyra: Orchestrating Dual Correction in Automated Theorem Proving

1 code implementation · 27 Sep 2023 · Chuanyang Zheng, Haiming Wang, Enze Xie, Zhengying Liu, Jiankai Sun, Huajian Xin, Jianhao Shen, Zhenguo Li, Yu Li

In addition, we introduce Conjecture Correction, an error-feedback mechanism designed to interact with the prover, refining formal proof conjectures using the prover's error messages.

Ranked #2 on Automated Theorem Proving on miniF2F-valid (Pass@100 metric)

Automated Theorem Proving · Hallucination
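The error-feedback idea behind Conjecture Correction can be sketched as a simple retry loop: a candidate formal proof goes to a prover, and on failure the error message drives a refinement. Both `prover` and `refine` are hypothetical stubs here (the tactic names are invented for illustration), not the paper's actual components.

```python
# Minimal sketch of a prover-in-the-loop correction cycle. The stub prover
# "accepts" only proofs using a stronger (made-up example) tactic, and the
# stub refiner upgrades the tactic after a failure.

def prover(proof):
    """Stub prover: returns (success, error_message)."""
    if "nlinarith" in proof:
        return True, ""
    return False, "tactic failed"

def refine(proof, error):
    """Stub refinement: swap in a stronger tactic in response to an error."""
    return proof.replace("linarith", "nlinarith")

def prove_with_feedback(proof, rounds=3):
    """Retry the proof, feeding each prover error back into refinement."""
    for _ in range(rounds):
        ok, err = prover(proof)
        if ok:
            return proof
        proof = refine(proof, err)
    return None                        # give up after `rounds` attempts

print(prove_with_feedback("by linarith"))  # → by nlinarith
```

A real system would route the error message into an LLM prompt rather than a string replacement; the loop structure is the point of the sketch.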

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

1 code implementation · 21 Sep 2023 · Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%.

Ranked #60 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning · GSM8K +5

FIMO: A Challenge Formal Dataset for Automated Theorem Proving

1 code implementation · 8 Sep 2023 · Chengwu Liu, Jianhao Shen, Huajian Xin, Zhengying Liu, Ye Yuan, Haiming Wang, Wei Ju, Chuanyang Zheng, Yichun Yin, Lin Li, Ming Zhang, Qun Liu

We present FIMO, an innovative dataset comprising formal mathematical problem statements sourced from the International Mathematical Olympiad (IMO) Shortlisted Problems.

Automated Theorem Proving

Forward-Backward Reasoning in Large Language Models for Mathematical Verification

no code implementations · 15 Aug 2023 · Weisen Jiang, Han Shi, Longhui Yu, Zhengying Liu, Yu Zhang, Zhenguo Li, James T. Kwok

Instead of using forward or backward reasoning alone, we propose FOBAR to combine FOrward and BAckward Reasoning for verification.

Mathematical Reasoning
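The forward-backward combination described above can be sketched end to end: forward reasoning samples candidate answers, and backward reasoning masks a known number in the question and checks whether the model recovers it given the candidate. `ask_model` is a deterministic stand-in for an LLM call so the sketch is runnable; the prompt shapes are illustrative, not the paper's templates.

```python
from collections import Counter

def ask_model(prompt):
    # hypothetical stub: a real system would call an LLM here
    if "If the answer is" in prompt:   # backward pass: recover masked number
        return "3" if "12" in prompt else "0"
    return "12"                        # forward pass: propose an answer

def forward_candidates(question, n=5):
    """Forward reasoning: sample n candidate answers."""
    return [ask_model(question) for _ in range(n)]

def backward_check(question, masked_value, candidate):
    """Backward reasoning: mask a known number in the question and see if
    the model recovers it when told the candidate answer."""
    prompt = (question.replace(masked_value, "x")
              + f" If the answer is {candidate}, what is x?")
    return ask_model(prompt) == masked_value

def fobar(question, masked_value):
    """Vote among forward candidates, keeping only those that also pass
    the backward verification."""
    votes = Counter(a for a in forward_candidates(question)
                    if backward_check(question, masked_value, a))
    return votes.most_common(1)[0][0] if votes else None

print(fobar("Tom has 3 boxes of 4 apples. How many apples?", "3"))  # → 12
```

The backward check acts as a filter on self-consistency voting: a candidate answer only earns a vote if the question remains solvable in reverse.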

Progressive-Hint Prompting Improves Reasoning in Large Language Models

1 code implementation · 19 Apr 2023 · Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li

The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability.

Arithmetic Reasoning · GSM8K +2
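Progressive-hint prompting can be sketched as a short loop: re-ask the question with all previous answers appended as hints, stopping once two consecutive rounds agree. The `ask_model` stub and hint phrasing are illustrative assumptions standing in for a real LLM call and the paper's prompt template.

```python
def ask_model(prompt):
    # hypothetical stub: the unhinted pass answers wrongly, the hinted
    # passes converge on the corrected answer
    return "58" if "Hint" in prompt else "56"

def progressive_hint(question, max_rounds=5):
    """Repeatedly prompt with prior answers as hints until the answer
    stabilizes across two consecutive rounds."""
    hints, previous = [], None
    for _ in range(max_rounds):
        prompt = question
        if hints:
            prompt += " (Hint: the answer is near " + ", ".join(hints) + ".)"
        answer = ask_model(prompt)
        if answer == previous:         # two consecutive rounds agree: stop
            return answer
        hints.append(answer)
        previous = answer
    return previous                    # fall back to the last answer

print(progressive_hint("What is 27 + 31?"))  # → 58
```

The convergence criterion is what distinguishes this from plain re-sampling: the model's own prior outputs progressively constrain the next attempt.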

Learning to Prove Trigonometric Identities

no code implementations · 14 Jul 2022 · Zhou Liu, YuJun Li, Zhengying Liu, Lin Li, Zhenguo Li

We define the normalized form of trigonometric identities, design a set of rules for the proof and put forward a method which can generate theoretically infinite trigonometric identities.

Automated Theorem Proving · Imitation Learning
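The rule-based generation idea above can be sketched in miniature: start from a seed identity and apply equivalence-preserving rewriting rules to derive new identities, validating each numerically. The specific seed and rules below are illustrative stand-ins, not the paper's rule set or normalized form.

```python
import math
import random

random.seed(0)  # reproducible sampling of check points

# seed identity: sin^2(x) + cos^2(x) = 1, stored as (lhs, rhs) callables
SEED = (lambda x: math.sin(x) ** 2 + math.cos(x) ** 2, lambda x: 1.0)

# equivalence-preserving rules: each maps one identity to a new one
RULES = [
    # argument doubling: f(x) = g(x)  =>  f(2x) = g(2x)
    lambda L, R: (lambda x: L(2 * x), lambda x: R(2 * x)),
    # add the same term to both sides
    lambda L, R: (lambda x: L(x) + math.sin(x) ** 2,
                  lambda x: R(x) + math.sin(x) ** 2),
    # multiply both sides by the same factor
    lambda L, R: (lambda x: L(x) * math.cos(x),
                  lambda x: R(x) * math.cos(x)),
]

def holds(L, R, trials=50):
    """Numerically check L(x) == R(x) at random sample points."""
    return all(abs(L(x) - R(x)) < 1e-9
               for x in (random.uniform(0.1, 1.4) for _ in range(trials)))

def generate(depth):
    """Enumerate identities reachable from the seed in <= depth rule steps."""
    frontier, found = [SEED], []
    for _ in range(depth):
        frontier = [rule(L, R) for (L, R) in frontier for rule in RULES]
        found.extend(frontier)
    return [(L, R) for (L, R) in found if holds(L, R)]

identities = generate(depth=2)
print(len(identities))  # → 12  (3 one-step + 9 two-step derivations)
```

Because each rule preserves equality, every derived pair passes the numeric check; composing rules to greater depths yields arbitrarily many identities, echoing the "theoretically infinite" claim.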

Advances in MetaDL: AAAI 2021 challenge and workshop

no code implementations · 1 Feb 2022 · Adrian El Baz, Isabelle Guyon, Zhengying Liu, Jan van Rijn, Sebastien Treguer, Joaquin Vanschoren

Winning methods featured various classifiers trained on top of the second-to-last layer of popular CNN backbones, fine-tuned on the meta-training data (not necessarily in an episodic manner), then trained on the labeled support sets and tested on the unlabeled query sets of the meta-test data.

Few-Shot Learning

Deep Statistical Solvers

1 code implementation · NeurIPS 2020 · Balthazar Donon, Zhengying Liu, Wenzhuo LIU, Isabelle Guyon, Antoine Marot, Marc Schoenauer

This paper introduces Deep Statistical Solvers (DSS), a new class of trainable solvers for optimization problems arising, e.g., from system simulations.

AgEBO-Tabular: Joint Neural Architecture and Hyperparameter Search with Autotuned Data-Parallel Training for Tabular Data

no code implementations · 30 Oct 2020 · Romain Egele, Prasanna Balaprakash, Venkatram Vishwanath, Isabelle Guyon, Zhengying Liu

Neural architecture search (NAS) is an AutoML approach that generates and evaluates multiple neural network architectures concurrently and improves the accuracy of the generated models iteratively.

Bayesian Optimization · Neural Architecture Search

LEAP nets for power grid perturbations

1 code implementation · 22 Aug 2019 · Benjamin Donnot, Balthazar Donon, Isabelle Guyon, Zhengying Liu, Antoine Marot, Patrick Panciatici, Marc Schoenauer

We propose a novel neural network embedding approach to model power transmission grids, in which high voltage lines are disconnected and reconnected with one-another from time to time, either accidentally or willfully.

Network Embedding · Transfer Learning
