Search Results for author: Keming Lu

Found 34 papers, 26 papers with code

MARGE: Improving Math Reasoning for LLMs with Guided Exploration

1 code implementation • 18 May 2025 • Jingyue Gao, Runji Lin, Keming Lu, Bowen Yu, Junyang Lin, Jianyu Chen

These results demonstrate MARGE's effectiveness in enhancing mathematical reasoning capabilities and unlocking the potential of scaling self-generated training data.

Math · Mathematical Reasoning

WorldPM: Scaling Human Preference Modeling

1 code implementation • 15 May 2025 • Binghai Wang, Runji Lin, Keming Lu, Le Yu, Zhenru Zhang, Fei Huang, Chujie Zheng, Kai Dang, Yang Fan, Xingzhang Ren, An Yang, Binyuan Hui, Dayiheng Liu, Tao Gui, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang, Bowen Yu, Jingren Zhou, Junyang Lin

Motivated by scaling laws in language modeling that demonstrate how test loss scales as a power law with model and dataset sizes, we find that similar laws exist in preference modeling.

Language Modeling · Language Modelling
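The WorldPM entry above hinges on test loss following a power law in dataset size. A minimal sketch of how such a law is fit in practice: linear regression in log-log space. The (size, loss) pairs below are invented purely for illustration, not data from the paper.

```python
import numpy as np

# Hypothetical (dataset_size, test_loss) pairs, assumed for illustration only.
sizes = np.array([1e5, 1e6, 1e7, 1e8])
losses = np.array([2.8, 2.3, 1.9, 1.6])

# A power law L(N) = a * N^(-b) is linear in log space:
# log L = log a - b * log N, so fit it with ordinary least squares.
slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
a, b = np.exp(intercept), -slope

def predict_loss(n):
    """Extrapolate test loss at dataset size n under the fitted power law."""
    return a * n ** (-b)
```

The same two-line fit extends to model size as a second regressor; scaling-law papers typically report the fitted exponent `b` as the headline quantity.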

AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models

1 code implementation • 24 Feb 2025 • Qin Zhu, Fei Huang, Runyu Peng, Keming Lu, Bowen Yu, Qinyuan Cheng, Xipeng Qiu, Xuanjing Huang, Junyang Lin

While logical reasoning evaluation of Large Language Models (LLMs) has attracted significant attention, existing benchmarks predominantly rely on multiple-choice formats that are vulnerable to random guessing, leading to overestimated performance and substantial performance fluctuations.

Logical Reasoning · Multiple-choice

ProcessBench: Identifying Process Errors in Mathematical Reasoning

1 code implementation • 9 Dec 2024 • Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin

We conduct extensive evaluation on ProcessBench, involving two types of models: process reward models (PRMs) and critic models, where for the latter we prompt general language models to critique each solution step by step.

GSM8K · Math · +1

Aligning Large Language Models via Self-Steering Optimization

1 code implementation • 22 Oct 2024 • Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun, Jingren Zhou, Junyang Lin

The key to automated alignment lies in providing learnable and accurate preference signals for preference learning without human annotation.

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models

no code implementations • 17 Oct 2024 • Qiaoyu Tang, Le Yu, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Le Sun

Post-training has emerged as a crucial paradigm for adapting large-scale pre-trained models to various tasks, whose effects are fully reflected by delta parameters (i.e., the disparity between post-trained and pre-trained parameters).

Quantization
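The delta-parameter view in the entry above reduces to simple tensor arithmetic: subtract pre-trained from post-trained weights, optionally edit the difference (rescaling is shown here as one simple family of edits, not the paper's full taxonomy), and add it back. A toy sketch with a hypothetical two-tensor "model":

```python
import numpy as np

def delta_params(pre, post):
    """Delta parameters: elementwise difference between post-trained
    and pre-trained weights, per named parameter tensor."""
    return {name: post[name] - pre[name] for name in pre}

def apply_edited_delta(pre, delta, scale=1.0):
    """Rebuild a model from pre-trained weights plus a (possibly
    rescaled) delta; scale=1.0 recovers the post-trained model exactly."""
    return {name: pre[name] + scale * delta[name] for name in pre}

# Toy two-tensor "model" for illustration.
pre = {"w": np.zeros(3), "b": np.array([1.0, 1.0])}
post = {"w": np.array([0.5, -0.5, 0.0]), "b": np.array([1.2, 0.8])}
d = delta_params(pre, post)
restored = apply_edited_delta(pre, d)
```

Quantizing or sparsifying `d` instead of rescaling it fits the same `apply_edited_delta` skeleton, which is what makes the "unified view" framing natural.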

LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

1 code implementation • 20 Jun 2024 • Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Junjie Hu, Tianyu Liu, Baobao Chang

In recent progress, mathematical verifiers have achieved success in mathematical reasoning tasks by validating the correctness of solutions generated by policy models.

Binary Classification · GSM8K · +2

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

1 code implementation • 19 Jun 2024 • Guanting Dong, Keming Lu, Chengpeng Li, Tingyu Xia, Bowen Yu, Chang Zhou, Jingren Zhou

AutoIF transforms the validation of instruction-following data quality into code verification, requiring LLMs to generate instructions, the corresponding code to check the correctness of the instruction responses, and unit test samples to verify the code's correctness.

Instruction Following
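The AutoIF snippet above describes turning data validation into code verification: the LLM emits an instruction, a checker program, and unit tests, and the checker is only trusted if it passes its own tests. A minimal sketch, where the instruction, checker source, and test cases are fixed stand-ins for what would be model-generated:

```python
# Hypothetical stand-ins for LLM-generated artifacts. In AutoIF these
# would all be sampled from the model; here they are fixed examples.
instruction = "Answer in exactly three words."

checker_src = """
def check(response):
    return len(response.split()) == 3
"""

unit_tests = [("one two three", True), ("too short", False)]

# Execute the generated checker, then validate it against the unit tests
# before trusting it to filter instruction-following training data.
namespace = {}
exec(checker_src, namespace)
check = namespace["check"]
checker_ok = all(check(resp) == expected for resp, expected in unit_tests)
```

Only (instruction, response) pairs whose validated checker returns `True` would survive into the training set; a real pipeline would also sandbox the `exec` call.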

PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

2 code implementations • 4 Jun 2024 • Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Yucheng Li, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Junjie Hu, Wen Xiao

Our experimental evaluations, utilizing the LongBench benchmark, show that PyramidKV matches the performance of models with a full KV cache while retaining only 12% of the KV cache, thus significantly reducing memory usage.
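The PyramidKV entry above keeps only 12% of the KV cache by giving lower layers larger token budgets than upper ones. A sketch of such a pyramidal allocation; the linear decay used here is an illustrative choice, not necessarily the paper's exact schedule:

```python
def pyramid_budgets(num_layers, seq_len, keep_ratio=0.12):
    """Per-layer KV-cache token budgets that shrink with depth (more
    tokens kept in lower layers, fewer in upper ones) while the total
    stays at keep_ratio of the full cache."""
    total = int(num_layers * seq_len * keep_ratio)
    # Linearly decreasing weights: layer 0 gets the largest share.
    weights = [num_layers - i for i in range(num_layers)]
    s = sum(weights)
    budgets = [total * w // s for w in weights]
    budgets[0] += total - sum(budgets)  # put rounding remainder in layer 0
    return budgets

b = pyramid_budgets(num_layers=32, seq_len=4096, keep_ratio=0.12)
```

Each layer would then evict all but its budgeted number of (key, value) pairs, typically keeping the most-attended tokens.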

Towards Scalable Automated Alignment of LLMs: A Survey

1 code implementation • 3 Jun 2024 • Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu

Alignment is the most critical step in building large language models (LLMs) that meet human needs.

Survey

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment

1 code implementation • 28 May 2024 • Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou

Effectively aligning Large Language Models (LLMs) with human-centric values while preventing the degradation of abilities acquired through Pre-training and Supervised Fine-tuning (SFT) poses a central challenge in Reinforcement Learning from Human Feedback (RLHF).

Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

1 code implementation • 23 Jan 2024 • Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou

Nevertheless, we posit that LLMs inherently harbor role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora.

All · Instruction Following · +1

Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models

no code implementations • 15 Nov 2023 • Keming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan, Chang Zhou, Jingren Zhou

Zooter shows computation efficiency in inference as it introduces only a minor computation overhead of a routing function compared with reward model ranking methods.

TAG
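Zooter's efficiency claim above comes from replacing reward-model ranking (generate with every expert, score each output) with a single cheap routing call per query. A minimal sketch, where the dot-product scorer and the expert vectors are assumed stand-ins for the learned routing function:

```python
import numpy as np

def route(query_embedding, expert_embeddings):
    """Score each expert with a cheap routing function (here a dot
    product against per-expert vectors) and dispatch the query to the
    argmax, instead of generating with every expert and ranking the
    outputs with a reward model."""
    scores = expert_embeddings @ query_embedding
    return int(np.argmax(scores))

# Toy setup: 3 experts with 4-dim representations.
experts = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.0]])
chosen = route(np.array([0.1, 0.9, 0.2, 0.0]), experts)
```

The routing function adds one small matrix-vector product per query, versus running k full LLM generations plus k reward-model passes for ranking-based ensembles.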

Speculative Contrastive Decoding

no code implementations • 15 Nov 2023 • Hongyi Yuan, Keming Lu, Fei Huang, Zheng Yuan, Chang Zhou

Large language models (LLMs) exhibit exceptional performance on language tasks, yet their auto-regressive inference is limited by high computational requirements and is sub-optimal due to exposure bias.
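The contrastive half of the method above amplifies what the large (expert) model prefers relative to a small (amateur) model. A sketch of the score combination on toy logits; the weighting `alpha` and the specific numbers are assumptions for illustration, and the speculative drafting loop is omitted:

```python
import numpy as np

def contrastive_logits(expert_logits, amateur_logits, alpha=0.5):
    """Boost tokens the expert model prefers relative to the amateur,
    countering exposure bias shared by both; alpha is an assumed weight."""
    return (1 + alpha) * expert_logits - alpha * amateur_logits

# Toy logits over a 3-token vocabulary: the expert alone would pick
# token 1, but the amateur also strongly favors it, so the contrastive
# score shifts the choice to token 0.
expert = np.array([1.0, 1.1, 0.5])
amateur = np.array([0.0, 2.0, 0.0])
token = int(np.argmax(contrastive_logits(expert, amateur)))
```

In the speculative variant, the amateur doubles as the draft model, so its logits come for free during drafting and the contrastive correction costs almost nothing extra.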

Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning

1 code implementation • 14 Nov 2023 • Shengguang Wu, Keming Lu, Benfeng Xu, Junyang Lin, Qi Su, Chang Zhou

The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets, as the model selects new data points most distinct from any existing ones according to its current embedding space.

Diversity · Instruction Following
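The selection rule in the entry above — always add the data point most distinct from everything already chosen in embedding space — is the classic greedy farthest-point pattern. A sketch under that assumption (the paper's exact distance and seeding may differ):

```python
import numpy as np

def diverse_subset(embeddings, k):
    """Greedy farthest-point selection: repeatedly add the point whose
    minimum distance to the chosen set is largest."""
    chosen = [0]  # seed with the first point for simplicity
    min_dist = np.linalg.norm(embeddings - embeddings[0], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        d = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)
    return chosen

# Toy embeddings: points 0 and 1 are near-duplicates, so the selector
# skips point 1 in favor of the two far-away points.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [0.0, 5.0]])
picked = diverse_subset(pts, 3)
```

The "self-evolved" twist is that `embeddings` come from the model currently being tuned, so the notion of "distinct" updates as training progresses.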

MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning

1 code implementation • 9 Oct 2023 • Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou

In this paper, we conduct an investigation for such data augmentation in math reasoning and are intended to answer: (1) What strategies of data augmentation are more effective; (2) What is the scaling relationship between the amount of augmented data and model performance; and (3) Can data augmentation incentivize generalization to out-of-domain mathematical reasoning tasks?

Ranked #60 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning · Data Augmentation · +3

How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

2 code implementations • 9 Oct 2023 • Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, Jingren Zhou

We propose four intriguing research questions to explore the association between model performance and various factors including data amount, composition ratio, model size and SFT strategies.

Code Generation · Instruction Following · +2

#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models

1 code implementation • 14 Aug 2023 • Keming Lu, Hongyi Yuan, Zheng Yuan, Runji Lin, Junyang Lin, Chuanqi Tan, Chang Zhou, Jingren Zhou

Based on this observation, we propose a data selector based on InsTag to select 6K diverse and complex samples from open-source datasets and fine-tune models on InsTag-selected data.

Diversity · Instruction Following · +1

Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

1 code implementation • 3 Aug 2023 • Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Keming Lu, Chuanqi Tan, Chang Zhou, Jingren Zhou

We find with augmented samples containing more distinct reasoning paths, RFT improves mathematical reasoning performance more for LLMs.

Ranked #111 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning · GSM8K · +1
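The finding above turns on counting *distinct* reasoning paths among sampled solutions. One rough stand-in for that notion, assumed here for illustration: treat the ordered list of calculation steps in a solution as its path signature and deduplicate on it:

```python
import re

def distinct_paths(samples):
    """Keep one sample per distinct reasoning path, where a path is
    identified by the ordered sequence of 'a op b = c' calculation
    steps it contains -- a rough proxy, not the paper's exact measure."""
    seen, kept = set(), []
    for text in samples:
        compact = text.replace(" ", "")
        key = tuple(re.findall(r"\d+[+*/-]\d+=\d+", compact))
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept

paths = distinct_paths([
    "First 2+3=5, then 5*2=10.",
    "We compute 2+3=5 and then 5*2=10.",  # same equations -> same path
    "First 2*2=4, then 4+6=10.",          # different path, same answer
])
```

Under this proxy, rejection-sampled solutions that merely rephrase the same calculations collapse into one path, matching the paper's observation that it is path diversity, not raw sample count, that drives the gains.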

PIVOINE: Instruction Tuning for Open-world Information Extraction

1 code implementation • 24 May 2023 • Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, Jianshu Chen

In particular, we construct INSTRUCTOPENWIKI, a substantial instruction tuning dataset for Open-world IE enriched with a comprehensive corpus, extensive annotations, and diverse instructions.

Instruction Following · Language Modeling · +2

Exploring Partial Knowledge Base Inference in Biomedical Entity Linking

1 code implementation • 18 Mar 2023 • Hongyi Yuan, Keming Lu, Zheng Yuan

Biomedical entity linking (EL) consists of named entity recognition (NER) and named entity disambiguation (NED).

Entity Disambiguation · Entity Linking · +3

Multi-hop Evidence Retrieval for Cross-document Relation Extraction

1 code implementation • 21 Dec 2022 • Keming Lu, I-Hung Hsu, Wenxuan Zhou, Mingyu Derek Ma, Muhao Chen

Relation Extraction (RE) has been extended to cross-document scenarios because many relations are not simply described in a single document.

Relation · Relation Extraction · +1

Knowledge-Enhanced Relation Extraction Dataset

no code implementations • 19 Oct 2022 • Yucong Lin, Hongming Xiao, Jiani Liu, Zichao Lin, Keming Lu, Feifei Wang, Wei Wei

Recently, knowledge-enhanced methods leveraging auxiliary knowledge graphs have emerged in relation extraction, surpassing traditional text-based approaches.

Entity Linking · Knowledge Graphs · +3

Summarization as Indirect Supervision for Relation Extraction

1 code implementation • 19 May 2022 • Keming Lu, I-Hung Hsu, Wenxuan Zhou, Mingyu Derek Ma, Muhao Chen

Considering that summarization tasks aim at acquiring concise expressions of synoptical information from the longer context, these tasks naturally align with the objective of RE, i.e., extracting a kind of synoptical information that describes the relation of entity mentions.

Relation · Relation Extraction · +1

BIOS: An Algorithmically Generated Biomedical Knowledge Graph

no code implementations • 18 Mar 2022 • Sheng Yu, Zheng Yuan, Jun Xia, Shengxuan Luo, Huaiyuan Ying, Sihang Zeng, Jingyi Ren, Hongyi Yuan, Zhengyun Zhao, Yucong Lin, Keming Lu, Jing Wang, Yutao Xie, Heung-Yeung Shum

For decades, these knowledge graphs have been developed via expert curation; however, this method can no longer keep up with today's AI development, and a transition to algorithmically generated BioMedKGs is necessary.

BIG-bench Machine Learning · Knowledge Graphs · +3

Multimodal Learning on Graphs for Disease Relation Extraction

1 code implementation • 16 Mar 2022 • Yucong Lin, Keming Lu, Sheng Yu, Tianxi Cai, Marinka Zitnik

On a dataset annotated by human experts, REMAP improves text-based disease relation extraction by 10.0% (accuracy) and 17.2% (F1-score) by fusing disease knowledge graphs with text information.

Knowledge Graphs Relation +1
