Search Results for author: Dongkuan Xu

Found 43 papers, 17 papers with code

Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs

no code implementations 24 Jun 2025 Travis Thompson, Seung-Hwan Lim, Paul Liu, Ruoying He, Dongkuan Xu

Large Language Models (LLMs) have achieved impressive capabilities in language understanding and generation, yet they continue to underperform on knowledge-intensive reasoning tasks due to limited access to structured context and multi-hop information.

Information Retrieval Knowledge Graphs +4

AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference

no code implementations 18 May 2025 Shitong Duan, Xiaoyuan Yi, Peng Zhang, Dongkuan Xu, Jing Yao, Tun Lu, Ning Gu, Xing Xie

Assessing Large Language Models (LLMs)' underlying value differences enables comprehensive comparison of their misalignment, cultural adaptability, and biases.

Informativeness

Leveraging Implicit Sentiments: Enhancing Reliability and Validity in Psychological Trait Evaluation of LLMs

1 code implementation 26 Mar 2025 Huanhuan Ma, Haisong Gong, Xiaoyuan Yi, Xing Xie, Dongkuan Xu

Through extensive experiments, we demonstrate that: 1) CSI effectively captures nuanced emotional patterns, revealing significant variation in LLMs across languages and contexts; 2) Compared to current approaches, CSI significantly improves reliability, yielding more consistent results; and 3) The correlation between CSI scores and the sentiment of LLM's real-world outputs exceeds 0.85, demonstrating its strong validity in predicting LLM behavior.

Exploring Multi-Modal Data with Tool-Augmented LLM Agents for Precise Causal Discovery

1 code implementation 18 Dec 2024 ChengAo Shen, Zhengzhang Chen, Dongsheng Luo, Dongkuan Xu, Haifeng Chen, Jingchao Ni

The advent of Large Language Models (LLMs) has ushered in an affordable way of leveraging the semantic cues for knowledge-driven causal discovery, but the development of LLMs for causal discovery lags behind other areas, particularly in the exploration of multi-modal data.

Causal Discovery Causal Inference

Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks

no code implementations 29 Jun 2024 Zifan Zhang, Yuchen Liu, Zhiyuan Peng, Mingzhe Chen, Dongkuan Xu, Shuguang Cui

To bridge this gap, we introduce a novel digital twin-assisted optimization framework, called D-REC, which integrates reinforcement learning (RL) with diverse intervention modules to ensure reliable caching in nextG wireless networks.

Reinforcement Learning (RL)

Adaptive Draft-Verification for Efficient Large Language Model Decoding

no code implementations 27 Jun 2024 Xukun Liu, Bowen Lei, Ruqi Zhang, Dongkuan Xu

Large language model (LLM) decoding involves generating a sequence of tokens based on a given context, where each token is predicted one at a time using the model's learned probabilities.

Language Modeling Language Modelling +1
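The abstract above describes standard autoregressive decoding, the sequential bottleneck that draft-verification schemes aim to relieve. A minimal greedy-decoding sketch follows; the toy next_token_probs stand-in is a hypothetical placeholder, not the paper's adaptive draft-verification method.

```python
import numpy as np

VOCAB = ["<eos>", "the", "cat", "sat", "on", "mat"]

def next_token_probs(context):
    # Toy stand-in for an LLM forward pass: returns a probability
    # distribution over VOCAB given the tokens generated so far.
    rng = np.random.default_rng(len(context))
    logits = rng.normal(size=len(VOCAB))
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def greedy_decode(prompt, max_new_tokens=10):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)        # one model call per new token
        token = VOCAB[int(np.argmax(probs))]    # pick the most likely token
        tokens.append(token)
        if token == "<eos>":                    # stop at end-of-sequence
            break
    return tokens

print(greedy_decode(["the"]))
```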

DALD: Improving Logits-based Detector without Logits from Black-box LLMs

1 code implementation 7 Jun 2024 Cong Zeng, Shengkun Tang, Xianjun Yang, Yuanzhou Chen, Yiyou Sun, Zhiqiang Xu, Yao Li, Haifeng Chen, Wei Cheng, Dongkuan Xu

However, these methods grapple with the misalignment between the distributions of the surrogate and the often undisclosed target models, leading to performance degradation, particularly with the introduction of new, closed-source models.

Text Detection Text Generation

Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World

1 code implementation 29 Mar 2024 Bowen Lei, Dongkuan Xu, Ruqi Zhang, Bani Mallick

Sparse training has emerged as a promising method for resource-efficient deep neural networks (DNNs) in real-world applications.

ToolNet: Connecting Large Language Models with Massive Tools via Tool Graph

no code implementations 29 Feb 2024 Xukun Liu, Zhiyuan Peng, Xiaoyuan Yi, Xing Xie, Lirong Xiang, Yuchen Liu, Dongkuan Xu

While achieving remarkable progress in a broad range of tasks, large language models (LLMs) remain significantly limited in properly using massive external tools.

In-Context Learning

Students' Perceptions and Preferences of Generative Artificial Intelligence Feedback for Programming

no code implementations 17 Dec 2023 Zhengdong Zhang, Zihan Dong, Yang Shi, Noboru Matsuda, Thomas Price, Dongkuan Xu

This study demonstrated that ChatGPT could generate Java programming assignment feedback that students perceived as formative.

Specificity

FP8-BERT: Post-Training Quantization for Transformer

no code implementations 10 Dec 2023 Jianwei Li, Tianchi Zhang, Ian En-Hsu Yen, Dongkuan Xu

Transformer-based models, such as BERT, have been widely applied to a broad range of natural language processing tasks.

Quantization

Towards Robust Pruning: An Adaptive Knowledge-Retention Pruning Strategy for Language Models

no code implementations 19 Oct 2023 Jianwei Li, Qi Lei, Wei Cheng, Dongkuan Xu

The pruning objective has recently been extended beyond accuracy and sparsity to robustness in language models.

Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection

no code implementations 19 Oct 2023 Jianwei Li, Weizhi Gao, Qi Lei, Dongkuan Xu

It is widely acknowledged that large and sparse models have higher accuracy than small and dense models under the same model size constraints.

AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation

no code implementations 29 Sep 2023 Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu

Unlike typical adaptive computation challenges that deal with single-step generation problems, diffusion processes with a multi-step generation need to dynamically adjust their computational resource allocation based on the ongoing assessment of each step's importance to the final image output, presenting a unique set of challenges.

text-guided-generation

Gentopia: A Collaborative Platform for Tool-Augmented LLMs

1 code implementation 8 Aug 2023 Binfeng Xu, Xukun Liu, Hua Shen, Zeyu Han, Yuhan Li, Murong Yue, Zhiyuan Peng, Yuchen Liu, Ziyu Yao, Dongkuan Xu

We present gentopia, an ALM framework enabling flexible customization of agents through simple configurations, seamlessly integrating various language models, task formats, prompting modules, and plugins into a unified paradigm.

Rethinking Data Distillation: Do Not Overlook Calibration

1 code implementation ICCV 2023 Dongyao Zhu, Bowen Lei, Jie Zhang, Yanbo Fang, Ruqi Zhang, Yiqun Xie, Dongkuan Xu

Neural networks trained on distilled data often produce over-confident output and require correction by calibration methods.

Dataset Distillation
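The abstract above notes that models trained on distilled data are over-confident and need calibration. As a hedged illustration of one standard post-hoc calibration method (not the paper's proposed technique), the sketch below fits a temperature on held-out logits by minimizing negative log-likelihood.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nll(temperature, logits, labels):
    # Negative log-likelihood of labels under temperature-scaled softmax.
    scaled = logits / temperature
    scaled -= scaled.max(axis=1, keepdims=True)
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels):
    # Search for the temperature that minimizes NLL on a held-out split.
    result = minimize_scalar(nll, bounds=(0.05, 10.0),
                             args=(logits, labels), method="bounded")
    return result.x

# Synthetic "over-confident" logits purely for demonstration.
rng = np.random.default_rng(0)
logits = rng.normal(scale=5.0, size=(100, 10))
labels = rng.integers(0, 10, size=100)
print("fitted temperature:", fit_temperature(logits, labels))
```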

Towards Reliable Rare Category Analysis on Graphs via Individual Calibration

1 code implementation 19 Jul 2023 Longfeng Wu, Bowen Lei, Dongkuan Xu, Dawei Zhou

In particular, to quantify the uncertainties in RCA, we develop a node-level uncertainty quantification algorithm to model the overlapping support regions with high uncertainty; to handle the rarity of minority classes in miscalibration calculation, we generalize the distribution-based calibration metric to the instance level and propose the first individual calibration measurement on graphs named Expected Individual Calibration Error (EICE).

Fraud Detection Network Intrusion Detection +1
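For reference, the distribution-based calibration metric that the authors generalize to the instance level is commonly computed as a binned Expected Calibration Error; the sketch below shows that standard form only, not the paper's instance-level EICE.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # Binned ECE: |accuracy - confidence| per confidence bin, weighted by
    # the fraction of samples falling in that bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=1000)
correct = rng.uniform(size=1000) < conf * 0.8   # systematically over-confident
print("ECE:", expected_calibration_error(conf, correct))
```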

AutoST: Training-free Neural Architecture Search for Spiking Transformers

1 code implementation 1 Jul 2023 Ziqing Wang, Qidong Zhao, Jinku Cui, Xu Liu, Dongkuan Xu

To address these limitations, we introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures.

Neural Architecture Search

ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models

2 code implementations 23 May 2023 Binfeng Xu, Zhiyuan Peng, Bowen Lei, Subhabrata Mukherjee, Yuchen Liu, Dongkuan Xu

Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution.

Retrieval

Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

no code implementations 24 Apr 2023 Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen, Caiwen Ding

Experimental results show that NDSNN achieves up to 20.52% improvement in accuracy on Tiny-ImageNet using ResNet-19 (with a sparsity of 99%) as compared to other SOTA methods (e.g., Lottery Ticket Hypothesis (LTH), SET-SNN, RigL-SNN).

Time Series Contrastive Learning with Information-Aware Augmentations

1 code implementation 21 Mar 2023 Dongsheng Luo, Wei Cheng, Yingheng Wang, Dongkuan Xu, Jingchao Ni, Wenchao Yu, Xuchao Zhang, Yanchi Liu, Yuncong Chen, Haifeng Chen, Xiang Zhang

A key component of contrastive learning is to select appropriate augmentations imposing some priors to construct feasible positive samples, such that an encoder can be trained to learn robust and discriminative representations.

Contrastive Learning Open-Ended Question Answering +2
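To make the role of augmentations concrete, here is a generic InfoNCE-style objective over two augmented views, where each sample's counterpart view is its positive and all other samples act as negatives; this is a sketch of contrastive learning in general, not the paper's information-aware augmentation selection.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    # z1[i] and z2[i] embed two augmented views of sample i; positives sit on
    # the diagonal of the similarity matrix, everything else acts as negatives.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
z_aug = z + 0.05 * rng.normal(size=z.shape)   # e.g., jitter as a toy augmentation
print("InfoNCE loss:", info_nce(z, z_aug))
```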

Efficient Informed Proposals for Discrete Distributions via Newton's Series Approximation

no code implementations 27 Feb 2023 Yue Xiang, Dongyao Zhu, Bowen Lei, Dongkuan Xu, Ruqi Zhang

Gradients have been exploited in proposal distributions to accelerate the convergence of Markov chain Monte Carlo algorithms on discrete distributions.

Efficient Exploration Extractive Text Summarization +2
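As a hedged illustration of the general idea (the paper's Newton's-series construction is not reproduced here), a first-order gradient-informed proposal for binary states weights each candidate bit flip by an estimate of the resulting change in log-probability.

```python
import numpy as np

def grad_log_prob(x, W, b):
    # Gradient of the unnormalized log-probability x^T W x + b^T x.
    return (W + W.T) @ x + b

def informed_flip_proposal(x, W, b, rng):
    # First-order (Taylor) estimate of the log-prob change from flipping each
    # bit, used to bias the proposal toward promising moves.
    delta = (1 - 2 * x) * grad_log_prob(x, W, b)
    weights = np.exp(delta / 2)
    probs = weights / weights.sum()
    i = rng.choice(len(x), p=probs)   # pick which bit to flip
    x_new = x.copy()
    x_new[i] = 1 - x_new[i]
    return x_new

rng = np.random.default_rng(0)
d = 10
W = rng.normal(scale=0.1, size=(d, d))
b = rng.normal(size=d)
x = rng.integers(0, 2, size=d).astype(float)
print(informed_flip_proposal(x, W, b, rng))
```

In a full sampler this proposal would be wrapped in a Metropolis-Hastings accept/reject step; only the proposal itself is sketched here.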

Calibrating the Rigged Lottery: Making All Tickets Reliable

1 code implementation 18 Feb 2023 Bowen Lei, Ruqi Zhang, Dongkuan Xu, Bani Mallick

Previous research has shown that deep neural networks tend to be over-confident, and we find that sparse training exacerbates this problem.

All Decision Making

Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction

1 code implementation 9 Jan 2023 Bowen Lei, Dongkuan Xu, Ruqi Zhang, Shuren He, Bani K. Mallick

To accelerate and stabilize the convergence of sparse training, we analyze the gradient changes and develop an adaptive gradient correction method.

Accelerating Dataset Distillation via Model Augmentation

2 code implementations CVPR 2023 Lei Zhang, Jie Zhang, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Yao Li, Dongkuan Xu

Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but efficient synthetic training datasets from large ones.

Dataset Distillation model

Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off

no code implementations 30 Nov 2022 Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding

We further design an acquisition function and provide the theoretical guarantees for the proposed method and clarify its convergence property.

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model

1 code implementation CVPR 2023 Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu

To handle this challenge, we propose a novel early exiting strategy for unified visual language models, named MuE, which dynamically skips layers in both the encoder and decoder based on input layer-wise similarities, allowing multiple early exits.

Decoder Language Modeling +1
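A minimal sketch of the early-exit idea, assuming computation stops once consecutive hidden states barely change (measured by cosine similarity); the toy layers and threshold are illustrative, and MuE's actual exit criterion and training procedure may differ.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def forward_with_early_exit(layers, x, threshold=0.99):
    # Run layers sequentially and exit once the hidden state saturates,
    # i.e., the next layer's output is nearly identical to its input.
    hidden = x
    for i, layer in enumerate(layers):
        new_hidden = layer(hidden)
        if cosine_similarity(hidden, new_hidden) > threshold:
            return new_hidden, i + 1          # exit early; report layers used
        hidden = new_hidden
    return hidden, len(layers)

# Toy "layers" whose updates shrink with depth, so hidden states saturate.
rng = np.random.default_rng(0)
layers = [
    (lambda h, s=0.6 ** k, W=rng.normal(size=(16, 16)) / 4: h + s * np.tanh(W @ h))
    for k in range(12)
]
x = rng.normal(size=16)
out, used = forward_with_early_exit(layers, x)
print("layers used:", used, "of", len(layers))
```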

S4: a High-sparsity, High-performance AI Accelerator

no code implementations 16 Jul 2022 Ian En-Hsu Yen, Zhibin Xiao, Dongkuan Xu

The degree of sparsity one can exploit has also become higher as larger model sizes are considered, following the trend of pre-training giant models.

Quantization Vocal Bursts Intensity Prediction

AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models

no code implementations 29 Jan 2022 Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao

Our framework AutoDistil addresses the above challenges with the following steps: (a) incorporates inductive bias and heuristics to partition the Transformer search space into K compact sub-spaces (K=3 for typical student sizes of base, small and tiny); (b) trains one SuperLM for each sub-space using a task-agnostic objective (e.g., self-attention distillation) with weight-sharing of students; (c) performs a lightweight search for the optimal student without re-training.

Inductive Bias Knowledge Distillation +1

InfoGCL: Information-Aware Graph Contrastive Learning

no code implementations NeurIPS 2021 Dongkuan Xu, Wei Cheng, Dongsheng Luo, Haifeng Chen, Xiang Zhang

The key point of this framework is to follow the Information Bottleneck principle to reduce the mutual information between contrastive parts while keeping task-relevant information intact at both the levels of the individual module and the entire framework so that the information loss during graph representation learning can be minimized.

Contrastive Learning Graph Classification +3
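For reference, the Information Bottleneck principle invoked above trades off compressing the input representation against retaining task-relevant information; the classic schematic form is shown below (the paper's multi-view graph objective builds on this principle but is not reproduced here).

```latex
% Classic Information Bottleneck trade-off over encoders p(z|x),
% where Z is the representation, X the input, and Y the target:
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```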

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

no code implementations ACL 2022 Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding

Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit.

Knowledge Distillation

Information-Aware Time Series Meta-Contrastive Learning

no code implementations 29 Sep 2021 Dongsheng Luo, Wei Cheng, Yingheng Wang, Dongkuan Xu, Jingchao Ni, Wenchao Yu, Xuchao Zhang, Yanchi Liu, Haifeng Chen, Xiang Zhang

How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question.

Contrastive Learning Meta-Learning +4

Data Augmentation with Adversarial Training for Cross-Lingual NLI

no code implementations ACL 2021 Xin Dong, Yaxin Zhu, Zuohui Fu, Dongkuan Xu, Gerard de Melo

Due to recent pretrained multilingual representation models, it has become feasible to exploit labeled data from one language to train a cross-lingual model that can then be applied to multiple new languages.

Cross-Lingual Natural Language Inference Data Augmentation +1

Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm

1 code implementation NAACL 2021 Dongkuan Xu, Ian E. H. Yen, Jinxi Zhao, Zhibin Xiao

In particular, common wisdom in CNN pruning states that sparse pruning compresses a model more than reducing the number of channels and layers (Elsen et al., 2020; Zhu and Gupta, 2017), while existing work on sparse pruning of BERT yields inferior results compared to small-dense counterparts such as TinyBERT (Jiao et al., 2020).

Network Pruning

Parameterized Explainer for Graph Neural Network

4 code implementations NeurIPS 2020 Dongsheng Luo, Wei Cheng, Dongkuan Xu, Wenchao Yu, Bo Zong, Haifeng Chen, Xiang Zhang

The unique explanation interpreting each instance independently is not sufficient to provide a global understanding of the learned GNN model, leading to a lack of generalizability and hindering it from being used in the inductive setting.

Graph Classification Graph Neural Network

Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification

no code implementations 29 Jul 2020 Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard de Melo

The resulting model then serves as a teacher to induce labels for unlabeled target language samples that can be used during further adversarial training, allowing us to gradually adapt our model to the target language.

General Classification intent-classification +4
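The teacher-induced labeling step described above follows the familiar self-training pattern: the source-trained model labels unlabeled target-language samples and only confident predictions are kept for further training. The sketch below shows that selection step only, as a hedged illustration rather than the paper's full adversarial training loop.

```python
import numpy as np

def pseudo_label(teacher_probs, threshold=0.9):
    # Keep only the unlabeled target-language samples the teacher labels
    # with high confidence; return their indices and induced labels.
    confidence = teacher_probs.max(axis=1)
    labels = teacher_probs.argmax(axis=1)
    keep = np.where(confidence >= threshold)[0]
    return keep, labels[keep]

# teacher_probs: softmax outputs of the source-trained model (synthetic here).
rng = np.random.default_rng(0)
teacher_probs = rng.dirichlet(alpha=[0.3] * 3, size=200)
idx, labels = pseudo_label(teacher_probs)
print(f"selected {len(idx)} of {len(teacher_probs)} samples for self-training")
```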

Longitudinal Deep Kernel Gaussian Process Regression

no code implementations 24 May 2020 Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant Honavar

Specifically, L-DKGPR eliminates the need for ad hoc heuristics or trial and error using a novel adaptation of deep kernel learning that combines the expressive power of deep neural networks with the flexibility of non-parametric kernel methods.

Gaussian Processes regression +1

How Do We Move: Modeling Human Movement with System Dynamics

no code implementations 1 Mar 2020 Hua Wei, Dongkuan Xu, Junjie Liang, Zhenhui Li

To the best of our knowledge, we are the first to learn to model the state transition of moving agents with system dynamics.

Imitation Learning

LMLFM: Longitudinal Multi-Level Factorization Machine

1 code implementation 11 Nov 2019 Junjie Liang, Dongkuan Xu, Yiwei Sun, Vasant Honavar

However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables, while accounting for complex correlation structure in the data and non-linear interactions among the variables.

Variable Selection

PIGMIL: Positive Instance Detection via Graph Updating for Multiple Instance Learning

no code implementations 12 Dec 2016 Dongkuan Xu, Jia Wu, Wei Zhang, Yingjie Tian

To this end, we propose a positive instance detection method via graph updating for multiple instance learning, called PIGMIL, to detect TPI accurately.

Multiple Instance Learning
