Search Results for author: Zhipeng Chen

Found 34 papers, 19 papers with code

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

no code implementations · 7 Mar 2025 · Huatong Song, Jinhao Jiang, Yingqian Min, Jie Chen, Zhipeng Chen, Wayne Xin Zhao, Lei Fang, Ji-Rong Wen

To address this, we propose R1-Searcher, a novel two-stage outcome-based RL approach designed to enhance the search capabilities of LLMs.
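A minimal sketch of what such an outcome-based reward might look like; the <search>/<answer> tags and the weights below are illustrative stand-ins, not the paper's actual format:

```python
import re

def outcome_reward(response: str, gold_answer: str) -> float:
    """Toy outcome-based reward: a format bonus plus answer correctness."""
    reward = 0.0
    # Stage-1-style format reward: did the model invoke the search tool?
    if re.search(r"<search>.*?</search>", response, flags=re.S):
        reward += 0.5
    # Outcome reward: does the final answer match the reference?
    match = re.search(r"<answer>(.*?)</answer>", response, flags=re.S)
    if match and match.group(1).strip().lower() == gold_answer.strip().lower():
        reward += 1.0
    return reward

print(outcome_reward("<search>capital of France</search><answer>Paris</answer>", "Paris"))
```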

RAG · Reinforcement Learning (RL)

An Empirical Study on Eliciting and Improving R1-like Reasoning Models

1 code implementation · 6 Mar 2025 · Zhipeng Chen, Yingqian Min, Beichen Zhang, Jie Chen, Jinhao Jiang, Daixuan Cheng, Wayne Xin Zhao, Zheng Liu, Xu Miao, Yang Lu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen

This approach achieves a remarkable accuracy of 86.67% with greedy search on AIME 2024, underscoring its effectiveness in enhancing model capabilities.

VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis

1 code implementation · 16 Dec 2024 · Zhipeng Chen, Lan Yang, Yonggang Qi, Honggang Zhang, Kaiyue Pang, Ke Li, Yi-Zhe Song

Despite the rapid advancements in text-to-image (T2I) synthesis, enabling precise visual control remains a significant challenge.

AI Agent · Image Generation

Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems

3 code implementations · 12 Dec 2024 · Yingqian Min, Zhipeng Chen, Jinhao Jiang, Jie Chen, Jia Deng, Yiwen Hu, Yiru Tang, Jiapeng Wang, Xiaoxue Cheng, Huatong Song, Wayne Xin Zhao, Zheng Liu, Zhongyuan Wang, Ji-Rong Wen

We introduce an "imitate, explore, and self-improve" framework, denoted as STILL-2, as our primary technical approach to train the reasoning model.
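A toy outline of such an imitate/explore/self-improve loop, under stated assumptions: the sampler and verifier below are stubs, and the real system fine-tunes an LLM starting from distilled long-thought demonstrations:

```python
import random

random.seed(0)

def verify(candidate: str, answer: str) -> bool:
    """Toy verifier: exact match against a reference answer."""
    return candidate.strip() == answer

def explore(problem: str, answer: str, k: int = 8) -> list[str]:
    """Stand-in for sampling k long-thought trajectories from the model."""
    return [answer if random.random() < 0.3 else "wrong" for _ in range(k)]

problems = [("1+1", "2"), ("2+3", "5")]
training_set = []  # would be seeded with distilled demonstrations (the "imitate" stage)

# Explore, then self-improve: keep only verified trajectories and feed them
# back into the training pool for the next round of fine-tuning.
for rnd in range(3):
    for problem, answer in problems:
        kept = [c for c in explore(problem, answer) if verify(c, answer)]
        training_set.extend((problem, c) for c in kept)
    # fine_tune(model, training_set) would run here in the real system

print(len(training_set), "verified trajectories collected")
```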

Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models

no code implementations · 10 Oct 2024 · Zhipeng Chen, Liang Song, Kun Zhou, Wayne Xin Zhao, Bingning Wang, WeiPeng Chen, Ji-Rong Wen

In the extraction stage, we first locate key neurons that are highly related to specific abilities, and then employ them to extract the transferable ability-specific weights.
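A minimal sketch, assuming a gradient-magnitude criterion stands in for the paper's neuron-location strategy; the toy layer and scoring rule are illustrative:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

ffn = nn.Linear(16, 64)          # toy stand-in for one transformer FFN layer
x = torch.randn(8, 16)           # batch of ability-specific inputs
ffn(x).relu().sum().backward()   # scalar proxy objective for the target ability

# Score each neuron (output unit) by the gradient magnitude of its weights,
# a simple stand-in for "highly related to the specific ability".
scores = ffn.weight.grad.abs().sum(dim=1)
top = torch.topk(scores, k=8).indices

# "Extract" the ability-specific weights of the selected key neurons.
ability_weights = ffn.weight.data[top].clone()
print(ability_weights.shape)     # torch.Size([8, 16])
```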

Towards Effective and Efficient Continual Pre-training of Large Language Models

no code implementations · 26 Jul 2024 · Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen

To make the continual pre-training (CPT) approach more traceable, this paper presents a technical report for continually pre-training Llama-3 (8B), which significantly enhances the Chinese language ability and scientific reasoning ability of the backbone model.

Math

Not Everything is All You Need: Toward Low-Redundant Optimization for Large Language Model Alignment

1 code implementation · 18 Jun 2024 · Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Jingyuan Wang, Ji-Rong Wen

Concretely, we first identify the neurons related to the human preference data with a gradient-based strategy, and then use reward models to identify the alignment-related key tokens for computing the loss.
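A minimal sketch of the token-selection idea, assuming random scores stand in for reward-model outputs and a simple top-k rule selects the key tokens; all shapes are illustrative:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq, vocab = 6, 100
logits = torch.randn(seq, vocab, requires_grad=True)   # model outputs on a response
targets = torch.randint(0, vocab, (seq,))              # preferred-response tokens

# Per-token loss over the whole response.
loss_per_token = F.cross_entropy(logits, targets, reduction="none")

# Stand-in for reward-model scores; keep only the top-k "key" tokens.
token_rewards = torch.rand(seq)
mask = torch.zeros(seq)
mask[torch.topk(token_rewards, k=3).indices] = 1.0

# Optimize only on the alignment-related key tokens.
loss = (loss_per_token * mask).sum() / mask.sum()
loss.backward()
print(float(loss))
```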

Language Modeling +2

Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR

no code implementations · 12 Jun 2024 · Yerbolat Khassanov, Zhipeng Chen, Tianfeng Chen, Tze Yuang Chong, Wei Li, Jun Zhang, Lu Lu, Yuxuan Wang

This paper addresses challenges in integrating new languages into a pre-trained multilingual automatic speech recognition (mASR) system, particularly in scenarios where training data for existing languages is limited or unavailable.
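A minimal sketch of a dual-path layer, assuming a standard LoRA parameterization: the frozen base path serves existing languages, while new-language inputs additionally route through a low-rank adapter. Class and argument names here are invented for illustration:

```python
import torch
import torch.nn as nn

class DualPathLinear(nn.Module):
    """Frozen base projection plus a low-rank adapter for new languages."""

    def __init__(self, d_in: int, d_out: int, rank: int = 4):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        for p in self.base.parameters():      # existing-language path stays frozen
            p.requires_grad_(False)
        self.lora_a = nn.Linear(d_in, rank, bias=False)
        self.lora_b = nn.Linear(rank, d_out, bias=False)
        nn.init.zeros_(self.lora_b.weight)    # adapter starts as a no-op

    def forward(self, x: torch.Tensor, new_language: bool) -> torch.Tensor:
        out = self.base(x)
        if new_language:                      # route only new-language data through LoRA
            out = out + self.lora_b(self.lora_a(x))
        return out

layer = DualPathLinear(32, 32)
x = torch.randn(2, 32)
print(layer(x, new_language=False).shape, layer(x, new_language=True).shape)
```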

Automatic Speech Recognition · Decoder +2

JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models

1 code implementation · 23 May 2024 · Kun Zhou, Beichen Zhang, Jiapeng Wang, Zhipeng Chen, Wayne Xin Zhao, Jing Sha, Zhichao Sheng, Shijin Wang, Ji-Rong Wen

We leverage it to synthesize 6 million math problems for pre-training our JiuZhang3.0 model, which only needs to invoke the GPT-4 API 9.3k times and pre-train on 4.6B tokens of data.

Knowledge Distillation · Math +1

Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint

1 code implementation · 11 Jan 2024 · Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Junchen Wan, Fuzheng Zhang, Di Zhang, Ji-Rong Wen

To address this, we propose a new RL method named RLMEC that uses a generative model as the reward model: it is trained on an erroneous-solution rewriting task under a minimum editing constraint and can produce token-level rewards for RL training.
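A toy stand-in for the token-level reward idea: compare the model's solution with a minimally edited rewrite (hand-written here) and reward the tokens that survive the edit. The real RLMEC reward model is a trained generative model, not difflib:

```python
from difflib import SequenceMatcher

def token_rewards(solution: list[str], rewrite: list[str]) -> list[float]:
    """+1 for solution tokens the rewrite keeps, -1 for tokens it edits away."""
    rewards = [-1.0] * len(solution)
    for block in SequenceMatcher(a=solution, b=rewrite).get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            rewards[i] = 1.0
    return rewards

solution = "2 + 2 = 5 so the answer is 5".split()
rewrite = "2 + 2 = 4 so the answer is 4".split()
print(list(zip(solution, token_rewards(solution, rewrite))))
```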

Question Answering · Reinforcement Learning (RL)

Extraction of n = 0 pick-up by locked mode detectors based on neural networks in J-TEXT

no code implementations · 23 Nov 2023 · Chengshuo Shen, Jianchao Li, Yonghua Ding, Jiaolong Dong, Nengchao Wang, Dongliang Han, Feiyue Mao, Da Li, Zhipeng Chen, Zhoujun Yang, Zhongyong Chen, Yuan Pan, J-TEXT team

A new method to extract this pick-up has been developed: Neural Networks (NNs) predict the n = 0 pick-up b_r^{n=0} measured by the LM detectors in J-TEXT.
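A minimal sketch of the regression setup, with invented dimensions and synthetic data: a small network learns to predict the pick-up from external coil currents so the prediction can be subtracted from the raw locked-mode signal:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-ins: 4 coil currents linearly couple into one detector.
coil_currents = torch.randn(256, 4)
true_pickup = coil_currents @ torch.tensor([[0.5], [-0.2], [0.1], [0.3]])

net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(200):                      # fit pickup = f(coil currents)
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(coil_currents), true_pickup)
    loss.backward()
    opt.step()

raw_signal = true_pickup + 0.05 * torch.randn(256, 1)   # pickup + residual signal
cleaned = raw_signal - net(coil_currents).detach()      # subtract predicted pickup
print(f"final fit loss: {loss.item():.4f}")
```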

Don't Make Your LLM an Evaluation Benchmark Cheater

no code implementations · 3 Nov 2023 · Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, Jiawei Han

Large language models (LLMs) have greatly advanced the frontiers of artificial intelligence, attaining remarkable improvement in model capacity.

ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models

1 code implementation · 23 May 2023 · Zhipeng Chen, Kun Zhou, Beichen Zhang, Zheng Gong, Wayne Xin Zhao, Ji-Rong Wen

Although large language models (LLMs) have achieved excellent performance in a variety of evaluation benchmarks, they still struggle with complex reasoning tasks that require specific knowledge and multi-hop reasoning.
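A toy sketch of a tool-augmented reasoning loop; the Action/Observation protocol and the stub model below are illustrative, not ChatCoT's actual prompt format:

```python
import re

def toy_llm(prompt: str) -> str:
    """Stub chat model: requests a calculator once, then answers."""
    if "Observation:" not in prompt:
        return "I should compute this. Action: calculate[(3 + 5) * 2]"
    return "The result is 16. Final Answer: 16"

def calculate(expr: str) -> str:
    return str(eval(expr, {"__builtins__": {}}))   # toy calculator tool; demo only

def chat_cot(question: str, max_rounds: int = 4) -> str:
    """Each round either triggers a tool call, whose result is appended to
    the chat as an observation, or terminates with a final answer."""
    prompt = question
    for _ in range(max_rounds):
        reply = toy_llm(prompt)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        call = re.search(r"Action: calculate\[(.+?)\]", reply)
        if call:
            prompt += f"\nObservation: {calculate(call.group(1))}"
    return "no answer"

print(chat_cot("What is (3 + 5) * 2?"))   # -> 16
```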

Math

TextBox 2.0: A Text Generation Library with Pre-trained Language Models

1 code implementation · 26 Dec 2022 · Tianyi Tang, Junyi Li, Zhipeng Chen, Yiwen Hu, Zhuohao Yu, Wenxun Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs).

Abstractive Text Summarization · Data-to-Text Generation +7

IDP-PGFE: An Interpretable Disruption Predictor based on Physics-Guided Feature Extraction

no code implementations · 28 Aug 2022 · Chengshuo Shen, Wei Zheng, Yonghua Ding, Xinkun Ai, Fengming Xue, Yu Zhong, Nengchao Wang, Li Gao, Zhipeng Chen, Zhoujun Yang, Zhongyong Chen, Yuan Pan, J-TEXT team

Understanding why a predictor makes a certain prediction can be as crucial as the prediction's accuracy for future tokamak disruption predictors.

Prediction

A Channel Mix Method for Fine-Grained Cross-Modal Retrieval

3 code implementations · IEEE International Conference on Multimedia and Expo (ICME) 2022 · Yang Shen, Xuhao Sun, Xiu-Shen Wei, Hanxu Hu, Zhipeng Chen

In this paper, we propose a simple but effective method for the challenging fine-grained cross-modal retrieval task, which aims to enable flexible retrieval among subordinate categories across different modalities.
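A guessed, simplified reading of the channel-mix idea, assuming it swaps a random subset of feature channels between the two modality branches as an augmentation; the paper's exact scheme may differ:

```python
import torch

torch.manual_seed(0)

def channel_mix(a: torch.Tensor, b: torch.Tensor, ratio: float = 0.3) -> torch.Tensor:
    """Replace a random subset of a's feature channels with b's."""
    d = a.shape[-1]
    idx = torch.randperm(d)[: int(ratio * d)]
    mixed = a.clone()
    mixed[..., idx] = b[..., idx]
    return mixed

img_feat = torch.randn(2, 128)   # image-branch features
txt_feat = torch.randn(2, 128)   # text-branch features
print(channel_mix(img_feat, txt_feat).shape)   # torch.Size([2, 128])
```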

Cross-Modal Retrieval · Retrieval

TextBox: A Unified, Modularized, and Extensible Framework for Text Generation

1 code implementation · ACL 2021 · Junyi Li, Tianyi Tang, Gaole He, Jinhao Jiang, Xiaoxuan Hu, Puzhao Xie, Zhipeng Chen, Zhuohao Yu, Wayne Xin Zhao, Ji-Rong Wen

In this paper, we release an open-source library, called TextBox, to provide a unified, modularized, and extensible text generation framework.

Text Generation

Contextual Recurrent Units for Cloze-style Reading Comprehension

no code implementations · 14 Nov 2019 · Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu

Recurrent Neural Networks (RNNs) are powerful models for handling sequential data and are widely used in various natural language processing tasks.

Reading Comprehension · Sentence +2

Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions

no code implementations · 21 Nov 2018 · Zhipeng Chen, Yiming Cui, Wentao Ma, Shijin Wang, Guoping Hu

Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer from several candidates.

Machine Reading Comprehension · Multiple-choice

HFL-RC System at SemEval-2018 Task 11: Hybrid Multi-Aspects Model for Commonsense Reading Comprehension

no code implementations · 15 Mar 2018 · Zhipeng Chen, Yiming Cui, Wentao Ma, Shijin Wang, Ting Liu, Guoping Hu

This paper describes the system that achieved state-of-the-art results at SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge.

Multiple-choice · Reading Comprehension

Secure Detection of Image Manipulation by means of Random Feature Selection

no code implementations · 2 Feb 2018 · Zhipeng Chen, Benedetta Tondi, Xiaolong Li, Rongrong Ni, Yao Zhao, Mauro Barni

We address the problem of data-driven image manipulation detection in the presence of an attacker with limited knowledge about the detector.
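A minimal sketch of detection with random feature selection: the detector trains on a secret random subset of the candidate features, so the attacker would have to guess the key rather than the feature set. Sizes and data are invented, and the downstream classifier is omitted:

```python
import numpy as np

def select_features(x: np.ndarray, n_keep: int, key: np.random.Generator) -> np.ndarray:
    """Keep a secret random subset of features, so an attacker with limited
    knowledge cannot tune the attack to the features the detector uses."""
    idx = key.choice(x.shape[1], size=n_keep, replace=False)
    return x[:, idx]

data_rng = np.random.default_rng(0)
secret_key = np.random.default_rng(1234)     # the seed acts as the detector's secret
features = data_rng.normal(size=(200, 50))   # toy pool of candidate features
reduced = select_features(features, n_keep=20, key=secret_key)
print(reduced.shape)                          # (200, 20) -> train the detector on these
```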

Cryptography and Security
