Search Results for author: Cheng Qian

Found 55 papers, 22 papers with code

ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges

1 code implementation · 21 May 2025 · Cheng Qian, Hongyi Du, Hongru Wang, Xiusi Chen, Yuji Zhang, Avirup Sil, ChengXiang Zhai, Kathleen McKeown, Heng Ji

ModelingBench also supports multiple valid solutions, capturing the ambiguity and creativity of practical modeling.

Math valid

RM-R1: Reward Modeling as Reasoning

1 code implementation · 5 May 2025 · Xiusi Chen, Gaotang Li, Ziqi Wang, Bowen Jin, Cheng Qian, Yu Wang, Hongru Wang, Yu Zhang, Denghui Zhang, Tong Zhang, Hanghang Tong, Heng Ji

The training of RM-R1 consists of two key stages: (1) distillation of high-quality reasoning chains and (2) reinforcement learning with verifiable rewards.

Math Reinforcement Learning (RL)

OTC: Optimal Tool Calls via Reinforcement Learning

no code implementations · 21 Apr 2025 · Hongru Wang, Cheng Qian, Wanjun Zhong, Xiusi Chen, Jiahao Qiu, Shijue Huang, Bowen Jin, Mengdi Wang, Kam-Fai Wong, Heng Ji

Tool-integrated reasoning (TIR) augments large language models (LLMs) with the ability to invoke external tools, such as search engines and code interpreters, to solve tasks beyond the capabilities of language-only reasoning.

Math reinforcement-learning +2

ToolRL: Reward is All Tool Learning Needs

no code implementations · 16 Apr 2025 · Cheng Qian, Emre Can Acikgoz, Qi He, Hongru Wang, Xiusi Chen, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji

In this work, we present the first comprehensive study on reward design for tool selection and application tasks within the RL paradigm.

All Reinforcement Learning (RL)

Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization

1 code implementation · 9 Apr 2025 · Shujin Wu, Cheng Qian, Yi R. Fung, Paul Pu Liang, Heng Ji

In this work, we introduce Alice (proactive learning with teacher's demonstrations), a framework that leverages complementary knowledge between teacher and student to enhance the learning process. We probe the knowledge base of the teacher model by eliciting their uncertainty, and then use these insights together with teachers' responses as demonstrations to guide student models in self-generating improved responses for supervision.

Logical Reasoning Mathematical Reasoning +1

A Desideratum for Conversational Agents: Capabilities, Challenges, and Future Directions

1 code implementation · 7 Apr 2025 · Emre Can Acikgoz, Cheng Qian, Hongru Wang, Vardhan Dongre, Xiusi Chen, Heng Ji, Dilek Hakkani-Tür, Gokhan Tur

Recent advances in Large Language Models (LLMs) have propelled conversational AI from traditional dialogue systems into sophisticated agents capable of autonomous actions, contextual awareness, and multi-turn interactions with users.

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

no code implementations · 4 Apr 2025 · Bingxiang He, Wenbin Zhang, Jiaxi Song, Cheng Qian, Zixuan Fu, Bowen Sun, Ning Ding, Haiwen Hong, Longtao Huang, Hui Xue, Ganqu Cui, Wanxiang Che, Zhiyuan Liu, Maosong Sun

Preference learning is critical for aligning large language models (LLMs) with human values, yet its success hinges on high-quality datasets comprising three core components: Preference Annotations, Instructions, and Response Pairs.

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

1 code implementation · 3 Mar 2025 · Kunlun Zhu, Hongyi Du, Zhaochen Hong, Xiaocheng Yang, Shuyi Guo, Zhe Wang, Zhenhailong Wang, Cheng Qian, Xiangru Tang, Heng Ji, Jiaxuan You

Large Language Models (LLMs) have shown remarkable capabilities as autonomous agents, yet existing benchmarks either focus on single-agent tasks or are confined to narrow domains, failing to capture the dynamics of multi-agent coordination and competition.

The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination

no code implementations · 22 Feb 2025 · Yuji Zhang, Sha Li, Cheng Qian, Jiateng Liu, Pengfei Yu, Chi Han, Yi R. Fung, Kathleen McKeown, ChengXiang Zhai, Manling Li, Heng Ji

To address it, we propose a novel concept: knowledge overshadowing, where a model's dominant knowledge can obscure less prominent knowledge during text generation, causing the model to fabricate inaccurate details.

Hallucination Text Generation

SMART: Self-Aware Agent for Tool Overuse Mitigation

1 code implementation · 17 Feb 2025 · Cheng Qian, Emre Can Acikgoz, Hongru Wang, Xiusi Chen, Avirup Sil, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji

To support this paradigm, we introduce SMART-ER, a dataset spanning three domains, where reasoning alternates between parametric knowledge and tool-dependent steps, with each step enriched by rationales explaining when tools are necessary.

GSM8K Large Language Model

Internal Activation as the Polar Star for Steering Unsafe LLM Behavior

no code implementations · 3 Feb 2025 · Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Denghui Zhang, Heng Ji

Large language models (LLMs) have demonstrated exceptional capabilities across a wide range of tasks but also pose significant risks due to their potential to generate harmful content.

Safety Alignment

EscapeBench: Pushing Language Models to Think Outside the Box

1 code implementation · 18 Dec 2024 · Cheng Qian, Peixuan Han, Qinyu Luo, Bingxiang He, Xiusi Chen, Yuji Zhang, Hongyi Du, Jiarui Yao, Xiaocheng Yang, Denghui Zhang, Yunzhu Li, Heng Ji

Language model agents excel in long-session planning and reasoning, but existing benchmarks primarily focus on goal-oriented tasks with explicit objectives, neglecting creative adaptation in unfamiliar environments.

Language Modeling Language Modelling

FastAttention: Extend FlashAttention2 to NPUs and Low-resource GPUs

no code implementations · 22 Oct 2024 · Haoran Lin, Xianzhi Yu, Kang Zhao, Lu Hou, Zongyuan Zhan, Stanislav Kamenev, Han Bao, Ting Hu, Mingkai Wang, Qixin Chang, Siyue Sui, Weihao Sun, Jiaxin Hu, Jun Yao, Zekun Yin, Cheng Qian, Ying Zhang, Yinfei Pan, Yu Yang, Weiguo Liu

In this work, we propose FastAttention, which pioneers the adaptation of the FlashAttention series for NPUs and low-resource GPUs to boost LLM inference efficiency.

Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs

1 code implementation · 18 Oct 2024 · Runchu Tian, Yanghao Li, Yuepeng Fu, Siyang Deng, Qinyu Luo, Cheng Qian, Shuo Wang, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Huadong Wang, Xiaojiang Liu

These experiments reveal that while most current models are robust against the "lost in the middle" issue, there exist significant biases related to the spacing of relevant information pieces.

Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance

1 code implementation · 16 Oct 2024 · Yaxi Lu, Shenzhi Yang, Cheng Qian, Guirong Chen, Qinyu Luo, Yesai Wu, Huadong Wang, Xin Cong, Zhong Zhang, Yankai Lin, Weiwen Liu, Yasheng Wang, Zhiyuan Liu, Fangming Liu, Maosong Sun

The labeled data is used to train a reward model that simulates human judgment and serves as an automatic evaluator of the proactiveness of LLM agents.

Optimized Biomedical Question-Answering Services with LLM and Multi-BERT Integration

no code implementations · 11 Oct 2024 · Cheng Qian, Xianglong Shi, Shanshan Yao, Yichen Liu, Fengming Zhou, Zishu Zhang, Junaid Akram, Ali Braytee, Ali Anaissi

We present a refined approach to biomedical question-answering (QA) services by integrating large language models (LLMs) with Multi-BERT configurations.

Decision Making Question Answering

Aligning LLMs with Individual Preferences via Interaction

1 code implementation · 4 Oct 2024 · Shujin Wu, May Fung, Cheng Qian, Jeonghwan Kim, Dilek Hakkani-Tur, Heng Ji

To address this gap, we train LLMs that can "interact to align", essentially cultivating the meta-skill of LLMs to implicitly infer the unspoken personalized preferences of the current user through multi-turn conversations, and then dynamically align their following behaviors and responses to these inferred preferences.

HSF: Defending against Jailbreak Attacks with Hidden State Filtering

no code implementations · 31 Aug 2024 · Cheng Qian, Hainan Zhang, Lei Sha, Zhiming Zheng

With the growing deployment of LLMs in daily applications like chatbots and content generation, efforts to ensure outputs align with human values and avoid harmful content have intensified.

LLM Jailbreak

Practical token pruning for foundation models in few-shot conversational virtual assistant systems

no code implementations · 21 Aug 2024 · Haode Qi, Cheng Qian, Jian Ni, Pratyush Singh, Reza Fazeli, Gengyu Wang, Zhongzheng Shu, Eric Wayne, Juergen Bross

The VA system is expected to be a cost-efficient SaaS service with low training and inference time while achieving high accuracy even with a small number of training samples.

Classification Contrastive Learning +5

PianoMime: Learning a Generalist, Dexterous Piano Player from Internet Demonstrations

no code implementations · 25 Jul 2024 · Cheng Qian, Julen Urain, Kevin Zakka, Jan Peters

In this work, we introduce PianoMime, a framework for training a piano-playing agent using internet demonstrations.

Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

no code implementations · 17 Jun 2024 · Bingxiang He, Ning Ding, Cheng Qian, Jia Deng, Ganqu Cui, Lifan Yuan, Huan-ang Gao, Huimin Chen, Zhiyuan Liu, Maosong Sun

For the first time, we show that zero-shot generalization during instruction tuning is a form of similarity-based generalization between training and test data at the instance level.

Continual Learning Zero-shot Generalization

LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves

no code implementations · 8 Mar 2024 · Jiayan Cao, Xueyu Zhu, Cheng Qian

from object detection and segmentation tasks, while these approaches require manual adjustments for curved objects, involve exhaustive searches on predefined anchors, require complex post-processing steps, and may lack flexibility when applied to real-world scenarios. In this paper, we propose a novel approach, LanePtrNet, which treats lane detection as a process of point voting and grouping on ordered sets. Our method takes backbone features as input and predicts a curve-aware centerness, which represents each lane as a point and assigns the most probable center point to it.

3D Lane Detection Autonomous Driving +2

Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

1 code implementation · 14 Feb 2024 · Cheng Qian, Bingxiang He, Zhong Zhuang, Jia Deng, Yujia Qin, Xin Cong, Zhong Zhang, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun

Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions.

Language Modeling Language Modelling

Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution

no code implementations · 25 Jan 2024 · Cheng Qian, Shihao Liang, Yujia Qin, Yining Ye, Xin Cong, Yankai Lin, Yesai Wu, Zhiyuan Liu, Maosong Sun

This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution.

Triple Simplex Matrix Completion for Expense Forecasting

no code implementations · 23 Oct 2023 · Cheng Qian, Lucas Glass, Nikos Sidiropoulos

Forecasting project expenses is a crucial step for businesses to avoid budget overruns and project failures.

Matrix Completion Time Series +1

Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source Model

1 code implementation · 8 Oct 2023 · Cheng Qian, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu

We first validate the efficacy of Toolink in harnessing the model's creativity and CoS ability on ChatGPT.

valid

"Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs

1 code implementation · 15 Sep 2023 · Cheng Qian, Xinran Zhao, Sherry Tongshuang Wu

Large language models (LLMs) acquire extensive knowledge during pre-training, known as their parametric knowledge.

Hallucination Knowledge Graphs

The 2nd Place Solution for 2023 Waymo Open Sim Agents Challenge

no code implementations · 28 Jun 2023 · Cheng Qian, Di Xiu, Minghao Tian

In this technical report, we present the 2nd place solution for the 2023 Waymo Open Sim Agents Challenge (WOSAC) [4].

Motion Forecasting

CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models

2 code implementations · 23 May 2023 · Cheng Qian, Chi Han, Yi R. Fung, Yujia Qin, Zhiyuan Liu, Heng Ji

Additionally, we introduce the Creation Challenge dataset, featuring 2K diverse questions, to emphasize the necessity and benefits of LLMs' tool creation ability.

2k Math +1

Recyclable Tuning for Continual Pre-training

1 code implementation · 15 May 2023 · Yujia Qin, Cheng Qian, Xu Han, Yankai Lin, Huadong Wang, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou

In pilot studies, we find that after continual pre-training, the upgraded PLM remains compatible with the outdated adapted weights to some extent.

Distinguish Sense from Nonsense: Out-of-Scope Detection for Virtual Assistants

no code implementations · 16 Jan 2023 · Cheng Qian, Haode Qi, Gengyu Wang, Ladislav Kunc, Saloni Potdar

Out of Scope (OOS) detection in Conversational AI solutions enables a chatbot to handle a conversation gracefully when it is unable to make sense of the end-user query.

Chatbot

Exploring Mode Connectivity for Pre-trained Language Models

1 code implementation · 25 Oct 2022 · Yujia Qin, Cheng Qian, Jing Yi, Weize Chen, Yankai Lin, Xu Han, Zhiyuan Liu, Maosong Sun, Jie Zhou

(3) How does the PLM's task knowledge change along the path connecting two minima?

GOCPT: Generalized Online Canonical Polyadic Tensor Factorization and Completion

1 code implementation · 8 May 2022 · Chaoqi Yang, Cheng Qian, Jimeng Sun

Our variant GOCPTE shows up to 1.2% and 5.5% fitness improvement on two datasets with about 20% speedup compared to the best model.

ATD: Augmenting CP Tensor Decomposition by Self Supervision

1 code implementation · 15 Jun 2021 · Chaoqi Yang, Cheng Qian, Navjot Singh, Cao Xiao, M Brandon Westover, Edgar Solomonik, Jimeng Sun

This paper addresses the above challenges by proposing augmented tensor decomposition (ATD), which effectively incorporates data augmentations and self-supervised learning (SSL) to boost downstream classification.

Data Augmentation Dimensionality Reduction +3

MTC: Multiresolution Tensor Completion from Partial and Coarse Observations

1 code implementation · 14 Jun 2021 · Chaoqi Yang, Navjot Singh, Cao Xiao, Cheng Qian, Edgar Solomonik, Jimeng Sun

Our MTC model explores tensor mode properties and leverages the hierarchy of resolutions to recursively initialize an optimization setup, and optimizes on the coupled system using alternating least squares.

Condition Integration Memory Network: An Interpretation of the Meaning of the Neuronal Design

no code implementations · 21 May 2021 · Cheng Qian

When a neuron's activation represents some symbolic element in the environment, each of its synapses can indicate a potential change to the element and its future state.

Multi-version Tensor Completion for Time-delayed Spatio-temporal Data

no code implementations · 11 May 2021 · Cheng Qian, Nikos Kargas, Cao Xiao, Lucas Glass, Nicholas Sidiropoulos, Jimeng Sun

Recovering such missing or noisy (under-reported) elements of the input tensor can be viewed as a generalized tensor completion problem.

Missing Elements

STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization

no code implementations · 8 Dec 2020 · Nikos Kargas, Cheng Qian, Nicholas D. Sidiropoulos, Cao Xiao, Lucas M. Glass, Jimeng Sun

Accurate prediction of the transmission of epidemic diseases such as COVID-19 is crucial for implementing effective mitigation measures.

Attribute Prediction

Learning Barrier Functions with Memory for Robust Safe Navigation

no code implementations · 3 Nov 2020 · Kehan Long, Cheng Qian, Jorge Cortés, Nikolay Atanasov

Control barrier functions are widely used to enforce safety properties in robot motion planning and control.

Motion Planning Robotics

SWIFT: Scalable Wasserstein Factorization for Sparse Nonnegative Tensors

no code implementations · 8 Oct 2020 · Ardavan Afshar, Kejing Yin, Sherry Yan, Cheng Qian, Joyce C. Ho, Haesun Park, Jimeng Sun

In particular, we define the N-th order tensor Wasserstein loss for the widely used tensor CP factorization and derive the optimization algorithm that minimizes it.

Computational Efficiency

On the Compression of Translation Operator Tensors in FMM-FFT-Accelerated SIE Simulators via Tensor Decompositions

no code implementations · 25 Sep 2020 · Cheng Qian, Abdulkadir C. Yucel

Tensor decomposition methodologies are proposed to reduce the memory requirement of translation operator tensors arising in the fast multipole method-fast Fourier transform (FMM-FFT)-accelerated surface integral equation (SIE) simulators.

Tensor Decomposition Translation

Model-aided Deep Neural Network for Source Number Detection

no code implementations · 29 Sep 2019 · Yuwen Yang, Feifei Gao, Cheng Qian, Guisheng Liao

Specifically, we first propose the eigenvalue based regression network (ERNet) and classification network (ECNet) to estimate the number of non-coherent sources, where the eigenvalues of the received signal covariance matrix and the source number are used as the input and the supervision label of the networks, respectively.

REP: Predicting the Time-Course of Drug Sensitivity

no code implementations · 27 Jul 2019 · Cheng Qian, Amin Emad, Nicholas D. Sidiropoulos

Time-course gene expression data is a rich source of information that can be used to unravel these complex processes, identify biomarkers of drug sensitivity and predict the response to a drug.

Drug Response Prediction

High-dimensional Gaussian graphical model for network-linked data

1 code implementation · 4 Jul 2019 · Tianxi Li, Cheng Qian, Elizaveta Levina, Ji Zhu

Graphical models are commonly used to represent conditional dependence relationships between variables.

Vocal Bursts Intensity Prediction
