Search Results for author: Jian Xie

Found 33 papers, 15 papers with code

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

no code implementations26 Jun 2025 Boyu Gou, Zanming Huang, Yuting Ning, Yu Gu, Michael Lin, Weijian Qi, Andrei Kopanev, Botao Yu, Bernal Jiménez Gutiérrez, Yiheng Shu, Chan Hee Song, Jiaman Wu, Shijie Chen, Hanane Nour Moussa, Tianshu Zhang, Jian Xie, Yifei Li, Tianci Xue, Zeyi Liao, Kai Zhang, Boyuan Zheng, Zhaowei Cai, Viktor Rozgic, Morteza Ziyadi, Huan Sun, Yu Su

Agentic search such as Deep Research systems, where large language models autonomously browse the web, synthesize information, and return comprehensive citation-backed answers, represents a major shift in how users interact with web-scale information.

Benchmarking

Efficient Medical VIE via Reinforcement Learning

no code implementations16 Jun 2025 Lijun Liu, Ruiyang Li, Zhaocheng Liu, Chenglin Zhu, Chong Li, Jiehan Cheng, Qiang Ju, Jian Xie

By fine-tuning Qwen2.5-VL-7B with our RLVR method, we achieve state-of-the-art performance on medical VIE tasks, significantly improving F1, precision, and recall.

Diversity Optical Character Recognition (OCR) +2
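For reference, the F1 score reported in this entry is the harmonic mean of precision and recall. A minimal sketch (illustrative only, not code from the paper):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: an extractor with 0.9 precision and 0.8 recall.
print(round(f1_score(0.9, 0.8), 4))  # → 0.8471
```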

ARM: Adaptive Reasoning Model

no code implementations26 May 2025 Siye Wu, Jian Xie, Yikai Zhang, Aili Chen, Kai Zhang, Yu Su, Yanghua Xiao

In this work, we propose Adaptive Reasoning Model (ARM), a reasoning model capable of adaptively selecting appropriate reasoning formats based on the task at hand.


SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning

no code implementations12 Apr 2025 Prabhat Pandey, Rupak Vignesh Swaminathan, K V Vijay Girish, Arunasish Sen, Jian Xie, Grant P. Strimel, Andreas Schwarz

We introduce SIFT (Speech Instruction Fine-Tuning), a 50M-example dataset designed for instruction fine-tuning and pre-training of speech-text large language models (LLMs).

Instruction Following

UniEDU: A Unified Language and Vision Assistant for Education Applications

no code implementations26 Mar 2025 Zhendong Chu, Jian Xie, Shen Wang, Zichao Wang, Qingsong Wen

Education materials for K-12 students often consist of multiple modalities, such as text and images, posing challenges for models to fully understand nuanced information in these materials.

Knowledge Tracing

LLM Agents for Education: Advances and Applications

no code implementations14 Mar 2025 Zhendong Chu, Shen Wang, Jian Xie, Tinghui Zhu, Yibo Yan, Jinheng Ye, Aoxiao Zhong, Xuming Hu, Jing Liang, Philip S. Yu, Qingsong Wen

Large Language Model (LLM) agents have demonstrated remarkable capabilities in automating tasks and driving innovation across diverse educational applications.

Fairness Hallucination +4

Implicit Reasoning in Transformers is Reasoning through Shortcuts

1 code implementation10 Mar 2025 Tianhe Lin, Jian Xie, Siyu Yuan, Deqing Yang

Test-time compute is emerging as a new paradigm for enhancing language models' complex multi-step reasoning capabilities, as demonstrated by the success of OpenAI's o1 and o3, as well as DeepSeek's R1.

Mathematical Reasoning

Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators

1 code implementation16 Jan 2025 Zhaocheng Liu, Quan Tu, Wen Ye, Yu Xiao, Zhishou Zhang, Hengfu Cui, Yalun Zhu, Qiang Ju, Shizheng Li, Jian Xie

By inputting medical records into our patient simulator to simulate patient responses, we conduct extensive experiments to explore the relationship between "inquiry" and "diagnosis" in the consultation process.

Diagnostic Sequential Decision Making

Baichuan4-Finance Technical Report

no code implementations17 Dec 2024 Hanyu Zhang, Boyu Qiu, Yuhao Feng, Shuqi Li, Qian Ma, Xiyuan Zhang, Qiang Ju, Dong Yan, Jian Xie

Large language models (LLMs) have demonstrated strong capabilities in language understanding, generation, and reasoning, yet their potential in finance remains underexplored due to the complexity and specialization of financial knowledge.

AAAR-1.0: Assessing AI's Potential to Assist Research

no code implementations29 Oct 2024 Renze Lou, Hanzi Xu, Sijia Wang, Jiangshu Du, Ryo Kamoi, Xiaoxin Lu, Jian Xie, Yuxuan Sun, Yusen Zhang, Jihyun Janice Ahn, Hongchao Fang, Zhuoyang Zou, Wenchao Ma, Xi Li, Kai Zhang, Congying Xia, Lifu Huang, Wenpeng Yin

Numerous studies have assessed the proficiency of AI systems, particularly large language models (LLMs), in facilitating everyday tasks such as email writing, question answering, and creative content generation.

Question Answering

Revealing the Barriers of Language Agents in Planning

1 code implementation16 Oct 2024 Jian Xie, Kexun Zhang, Jiangjie Chen, Siyu Yuan, Kai Zhang, Yikai Zhang, Lei Li, Yanghua Xiao

Although existing studies have highlighted weak performance in agent planning, the deeper underlying issues and the mechanisms and limitations of the strategies proposed to address them remain insufficiently understood.

Boosting Deductive Reasoning with Step Signals In RLHF

no code implementations12 Oct 2024 Jialian Li, Yipin Zhang, Wei Shen, Yuzi Yan, Jian Xie, Dong Yan

Logical reasoning is a crucial task for Large Language Models (LLMs), enabling them to tackle complex problems.

Formal Logic Logical Reasoning

Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown

no code implementations1 Oct 2024 Xingzhou Lou, Dong Yan, Wei Shen, Yuzi Yan, Jian Xie, Junge Zhang

Reward models (RM) play a critical role in aligning generations of large language models (LLM) to human expectations.

Uncertainty Quantification

Reward-Robust RLHF in LLMs

no code implementations18 Sep 2024 Yuzi Yan, Xingzhou Lou, Jialian Li, Yiping Zhang, Jian Xie, Chao Yu, Yu Wang, Dong Yan, Yuan Shen

As Large Language Models (LLMs) continue to progress toward more advanced forms of intelligence, Reinforcement Learning from Human Feedback (RLHF) is increasingly seen as a key pathway toward achieving Artificial General Intelligence (AGI).

Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning

1 code implementation15 Jul 2024 Yulong Wang, Tianhao Shen, Lifeng Liu, Jian Xie

To address these limitations, we introduce Sibyl, a simple yet powerful LLM-based agent framework designed to tackle complex reasoning tasks by efficiently leveraging a minimal set of tools.

In-Context Learning

Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities

1 code implementation10 Jul 2024 Tianjie Ju, Yiting Wang, Xinbei Ma, Pengzhou Cheng, Haodong Zhao, Yulong Wang, Lifeng Liu, Jian Xie, Zhuosheng Zhang, Gongshen Liu

The rapid adoption of large language models (LLMs) in multi-agent systems has highlighted their impressive capabilities in various applications, such as collaborative problem-solving and autonomous negotiation.

counterfactual Fact Checking +3

3D-Properties: Identifying Challenges in DPO and Charting a Path Forward

no code implementations11 Jun 2024 Yuzi Yan, Yibo Miao, Jialian Li, Yipin Zhang, Jian Xie, Zhijie Deng, Dong Yan

Aligning large language models (LLMs) with human preference has recently gained tremendous attention, with the canonical yet costly RLHF-PPO and the simple and straightforward Direct Preference Optimization (DPO) as two examples.

Instruction Following Mathematical Problem-Solving

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling

2 code implementations21 May 2024 Xingzhou Lou, Junge Zhang, Jian Xie, Lifeng Liu, Dong Yan, Kaiqi Huang

Human preference alignment is critical in building powerful and reliable large language models (LLMs).

From Persona to Personalization: A Survey on Role-Playing Language Agents

no code implementations28 Apr 2024 Jiangjie Chen, Xintao Wang, Rui Xu, Siyu Yuan, Yikai Zhang, Wei Shi, Jian Xie, Shuang Li, Ruihan Yang, Tinghui Zhu, Aili Chen, Nianqi Li, Lida Chen, Caiyu Hu, Siye Wu, Scott Ren, Ziquan Fu, Yanghua Xiao

Through this work, we aim to establish a clear taxonomy of RPLA research and applications, facilitate future research in this critical and ever-evolving field, and pave the way for a future where humans and RPLAs coexist in harmony.

In-Context Learning Instruction Following

How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?

1 code implementation4 Apr 2024 Siye Wu, Jian Xie, Jiangjie Chen, Tinghui Zhu, Kai Zhang, Yanghua Xiao

By leveraging the retrieval of information from external knowledge databases, Large Language Models (LLMs) exhibit enhanced capabilities for accomplishing many knowledge-intensive tasks.

Retrieval

Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models

1 code implementation28 Mar 2024 Ang Lv, Yuhan Chen, Kaiyi Zhang, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan

In this paper, we delve into several mechanisms employed by Transformer-based language models (LLMs) for factual recall tasks.

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

2 code implementations2 Feb 2024 Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su

Are these language agents capable of planning in more complex settings that are out of the reach of prior AI agents?

Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning

1 code implementation31 Jan 2024 Tinghui Zhu, Kai Zhang, Jian Xie, Yu Su

Recent advancements have significantly augmented the reasoning capabilities of Large Language Models (LLMs) through various methodologies, especially chain-of-thought (CoT) reasoning.

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

no code implementations5 Dec 2023 Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin

In the realm of large language models (LLMs), enhancing instruction-following capability often involves curating expansive training data.

Instruction Following

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

2 code implementations22 May 2023 Jian Xie, Kai Zhang, Jiangjie Chen, Renze Lou, Yu Su

By providing external information to large language models (LLMs), tool augmentation (including retrieval augmentation) has emerged as a promising solution for addressing the limitations of LLMs' static parametric memory.

Retrieval

Dialogue State Distillation Network with Inter-slot Contrastive Learning for Dialogue State Tracking

no code implementations16 Feb 2023 Jing Xu, Dandan Song, Chong Liu, Siu Cheung Hui, Fei Li, Qiang Ju, Xiaonan He, Jian Xie

In this paper, we propose a Dialogue State Distillation Network (DSDN) to utilize relevant information from previous dialogue states and mitigate the gap in utilization between training and testing.

Contrastive Learning Dialogue State Tracking +1

A Transformer-Based User Satisfaction Prediction for Proactive Interaction Mechanism in DuerOS

no code implementations5 Dec 2022 Wei Shen, Xiaonan He, Chuheng Zhang, Xuyun Zhang, Jian Xie

Moreover, they are trained and evaluated on the benchmark datasets with adequate labels, which are expensive to obtain in a commercial dialogue system.

Spoken Dialogue Systems

Shift Convolution Network for Stereo Matching

no code implementations20 Nov 2019 Jian Xie

In this paper, we present Shift Convolution Network (ShiftConvNet) to provide matching capability between two feature maps for stereo estimation.

Stereo Matching
