Search Results for author: Jinhao Jiang

Found 22 papers, 12 papers with code

CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability

no code implementations • 15 May 2025 Han Peng, Jinhao Jiang, Zican Dong, Wayne Xin Zhao, Lei Fang

Advancements in Large Language Models (LLMs) have extended their input context length, yet they still struggle with retrieval and reasoning in long-context inputs.

Question Answering • RAG +1

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

1 code implementation • 7 Mar 2025 Huatong Song, Jinhao Jiang, Yingqian Min, Jie Chen, Zhipeng Chen, Wayne Xin Zhao, Lei Fang, Ji-Rong Wen

To address this, we propose R1-Searcher, a novel two-stage outcome-based RL approach designed to enhance the search capabilities of LLMs.

RAG • Reinforcement Learning (RL)

An Empirical Study on Eliciting and Improving R1-like Reasoning Models

1 code implementation • 6 Mar 2025 Zhipeng Chen, Yingqian Min, Beichen Zhang, Jie Chen, Jinhao Jiang, Daixuan Cheng, Wayne Xin Zhao, Zheng Liu, Xu Miao, Yang Lu, Lei Fang, Zhongyuan Wang, Ji-Rong Wen

This approach achieves a remarkable accuracy of 86.67% with greedy search on AIME 2024, underscoring its effectiveness in enhancing model capabilities.

LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation

no code implementations • 11 Feb 2025 Zican Dong, Junyi Li, Jinhao Jiang, Mingyu Xu, Wayne Xin Zhao, Bingning Wang, WeiPeng Chen

To address these challenges, we propose Long Context Pre-training with Restoration Distillation (LongReD), a novel approach designed to mitigate short-text performance degradation through minimizing the distribution discrepancy between the extended and original models.
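The distillation idea named above can be illustrated with a toy scalar version: penalize the divergence between the original model's and the extended model's next-token distributions on short text. The two three-token distributions below are made-up numbers, and this loss is only a sketch of the idea, not LongReD's actual training objective.

```python
# Sketch: measure the distribution discrepancy between the original
# (short-context) model and the context-extended model on a short input.
# The toy probabilities are illustrative assumptions, not real model outputs.
import math

def kl_divergence(p, q):
    """KL(p || q) between two next-token probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

original = [0.7, 0.2, 0.1]   # original model's next-token probs (teacher)
extended = [0.5, 0.3, 0.2]   # extended model's next-token probs (student)

# A distillation loss would push this discrepancy toward zero,
# restoring the extended model's short-text behavior.
distill_loss = kl_divergence(original, extended)
```

Minimizing this term drives the extended model's short-text predictions back toward the original model's, which is the intuition behind "restoration distillation."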

Holistically Guided Monte Carlo Tree Search for Intricate Information Seeking

no code implementations • 7 Feb 2025 Ruiyang Ren, Yuhao Wang, Junyi Li, Jinhao Jiang, Wayne Xin Zhao, Wenjie Wang, Tat-Seng Chua

We reformulate the task as a progressive information collection process with a knowledge memory and unite an adaptive checklist with multi-perspective reward modeling in MCTS.
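One ingredient mentioned above, checklist-style scoring of a search state, can be sketched as a coverage score over the knowledge memory. The checklist items, memory contents, and substring matching below are illustrative assumptions, not the paper's actual reward model.

```python
# Sketch: score a partial information-seeking state by how many
# checklist items the collected knowledge memory already covers.
# Items and facts are made up; a real system would use learned matching.

def checklist_reward(knowledge_memory, checklist):
    """Fraction of checklist items covered by at least one collected fact."""
    covered = sum(any(item in fact for fact in knowledge_memory)
                  for item in checklist)
    return covered / len(checklist)

memory = ["the award was won in 2019", "the winner was born in Oslo"]
checklist = ["award", "born", "nationality"]

# 2 of 3 items are covered, so the reward is 2/3.
score = checklist_reward(memory, checklist)
```

In an MCTS loop, such a score would be one of several perspectives combined when backing up node values, steering the search toward states that fill the remaining checklist gaps.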

YuLan-Mini: An Open Data-efficient Language Model

2 code implementations • 23 Dec 2024 Yiwen Hu, Huatong Song, Jia Deng, Jiapeng Wang, Jie Chen, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Wayne Xin Zhao, Ji-Rong Wen

Effective pre-training of large language models (LLMs) has been challenging due to the immense resource demands and the complexity of the technical processes involved.

Language Modeling • Language Modelling +1

RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement

no code implementations • 17 Dec 2024 Jinhao Jiang, Jiayi Chen, Junyi Li, Ruiyang Ren, Shijie Wang, Wayne Xin Zhao, Yang song, Tao Zhang

Existing large language models (LLMs) show exceptional problem-solving capabilities but might struggle with complex reasoning tasks.

RAG • Retrieval

Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems

3 code implementations • 12 Dec 2024 Yingqian Min, Zhipeng Chen, Jinhao Jiang, Jie Chen, Jia Deng, Yiwen Hu, Yiru Tang, Jiapeng Wang, Xiaoxue Cheng, Huatong Song, Wayne Xin Zhao, Zheng Liu, Zhongyuan Wang, Ji-Rong Wen

We introduce an "imitate, explore, and self-improve" framework, denoted as STILL-2, as our primary technical approach to train the reasoning model.

Towards Effective and Efficient Continual Pre-training of Large Language Models

no code implementations • 26 Jul 2024 Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen

To make the continual pre-training (CPT) approach more traceable, this paper presents a technical report on continually pre-training Llama-3 (8B), which significantly enhances the Chinese language ability and scientific reasoning ability of the backbone model.

Math

Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment

no code implementations • 15 Jul 2024 Jinhao Jiang, Junyi Li, Wayne Xin Zhao, Yang song, Tao Zhang, Ji-Rong Wen

However, this method can result in inefficient knowledge memorization, since the model lacks awareness of how the knowledge will be used, and it imposes substantial demands on LLMs to learn knowledge utilization and format alignment simultaneously from limited training samples.

Domain Adaptation • Memorization

KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

no code implementations • 17 Feb 2024 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yang song, Chen Zhu, HengShu Zhu, Ji-Rong Wen

To guarantee effectiveness, we leverage a programming language to formulate the multi-hop reasoning process over the KG, and synthesize a code-based instruction dataset to fine-tune the base LLM.

Knowledge Graphs

ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph

no code implementations • 30 Dec 2023 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yaliang Li, Ji-Rong Wen

To better perform reasoning on KGs, recent work typically adopts a pre-trained language model (PLM) to model the question, and a graph neural network (GNN) based module to perform multi-hop reasoning on the KG.

Graph Neural Network • Language Modelling +1

StructGPT: A General Framework for Large Language Model to Reason over Structured Data

1 code implementation • 16 May 2023 Jinhao Jiang, Kun Zhou, Zican Dong, Keming Ye, Wayne Xin Zhao, Ji-Rong Wen

Specifically, we propose an invoking-linearization-generation procedure to support LLMs in reasoning on the structured data with the help of the external interfaces.

Language Modeling • Language Modelling +2
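The invoking-linearization-generation procedure can be sketched as a three-step loop over structured data. The toy table, interface function, and stub LLM below are assumptions for illustration, not StructGPT's actual interfaces.

```python
# Sketch of invoking-linearization-generation over a toy table:
# 1. invoke an interface to fetch structured evidence,
# 2. linearize it into plain text,
# 3. let a (stubbed) LLM generate the answer from that text.

TABLE = {
    "header": ["country", "capital"],
    "rows": [["France", "Paris"], ["Japan", "Tokyo"]],
}

def invoke_get_row(table, key):
    """Interface call: fetch the row whose first column matches `key`."""
    for row in table["rows"]:
        if row[0] == key:
            return row
    return None

def linearize(header, row):
    """Flatten a structured record into text an LLM can consume."""
    return "; ".join(f"{h}: {v}" for h, v in zip(header, row))

def stub_llm(prompt):
    """Stand-in for an LLM: reads the capital off the linearized evidence."""
    for part in prompt.split("; "):
        if part.startswith("capital: "):
            return part[len("capital: "):]
    return "unknown"

def answer(country):
    row = invoke_get_row(TABLE, country)        # 1. invoke the interface
    evidence = linearize(TABLE["header"], row)  # 2. linearize the result
    return stub_llm(evidence)                   # 3. generate from evidence

print(answer("Japan"))  # -> Tokyo
```

The point of the loop is that the LLM never sees raw structure, only linearized evidence returned by interface calls, so the same procedure generalizes across tables, KGs, and databases.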

UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph

1 code implementation • 2 Dec 2022 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen

Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG).

Language Modelling • Multi-hop Question Answering +2
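The notion of answer entities multiple hops away can be illustrated with a toy traversal. The miniature KG and the two-hop question below are made up, and real KGQA systems such as UniKGQA learn which relations to follow rather than expanding all of them.

```python
# Toy illustration of multi-hop KGQA: starting from a topic entity,
# expand the frontier hop by hop until the answer distance is reached.
# The KG contents are fabricated for illustration.

KG = {  # entity -> list of (relation, neighbor)
    "Inception": [("directed_by", "Christopher Nolan")],
    "Christopher Nolan": [("born_in", "London")],
    "London": [("capital_of", "United Kingdom")],
}

def entities_at_hop(topic, hops):
    """Return all entities exactly `hops` steps from the topic entity."""
    frontier = {topic}
    for _ in range(hops):
        frontier = {neighbor
                    for entity in frontier
                    for _, neighbor in KG.get(entity, [])}
    return frontier

# Two-hop question: "Where was the director of Inception born?"
print(entities_at_hop("Inception", 2))  # -> {'London'}
```

The hard part, which this sketch ignores, is deciding which relation to follow at each hop; UniKGQA's contribution is unifying that retrieval step with the reasoning step in one model.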

TextBox: A Unified, Modularized, and Extensible Framework for Text Generation

1 code implementation • ACL 2021 Junyi Li, Tianyi Tang, Gaole He, Jinhao Jiang, Xiaoxuan Hu, Puzhao Xie, Zhipeng Chen, Zhuohao Yu, Wayne Xin Zhao, Ji-Rong Wen

In this paper, we release an open-source library, called TextBox, to provide a unified, modularized, and extensible text generation framework.

Text Generation
