1 code implementation • 29 Mar 2024 • Bowen Lei, Dongkuan Xu, Ruqi Zhang, Bani Mallick
Sparse training has emerged as a promising method for resource-efficient deep neural networks (DNNs) in real-world applications.
no code implementations • 7 Mar 2024 • Xinpeng Wang, Shitong Duan, Xiaoyuan Yi, Jing Yao, Shanlin Zhou, Zhihua Wei, Peng Zhang, Dongkuan Xu, Maosong Sun, Xing Xie
Big models have achieved revolutionary breakthroughs in the field of AI, but they also pose potential concerns.
no code implementations • 29 Feb 2024 • Xukun Liu, Zhiyuan Peng, Xiaoyuan Yi, Xing Xie, Lirong Xiang, Yuchen Liu, Dongkuan Xu
While achieving remarkable progress in a broad range of tasks, large language models (LLMs) remain significantly limited in properly using massive external tools.
no code implementations • 17 Dec 2023 • Zhengdong Zhang, Zihan Dong, Yang Shi, Noboru Matsuda, Thomas Price, Dongkuan Xu
This study demonstrated that ChatGPT could generate Java programming assignment feedback that students perceived as formative.
no code implementations • 10 Dec 2023 • Jianwei Li, Tianchi Zhang, Ian En-Hsu Yen, Dongkuan Xu
Transformer-based models, such as BERT, have been widely applied across a broad range of natural language processing tasks.
no code implementations • 19 Oct 2023 • Jianwei Li, Qi Lei, Wei Cheng, Dongkuan Xu
The pruning objective has recently extended beyond accuracy and sparsity to robustness in language models.
no code implementations • 19 Oct 2023 • Jianwei Li, Weizhi Gao, Qi Lei, Dongkuan Xu
It is widely acknowledged that large and sparse models have higher accuracy than small and dense models under the same model size constraints.
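As background for the large-sparse vs. small-dense comparison above, a minimal sketch of global magnitude pruning, a standard baseline for producing sparse models (an illustrative assumption, not the specific method of this paper):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that roughly a
    `sparsity` fraction of the weights becomes exactly zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

w = np.array([[0.9, -0.01], [0.05, -1.2]])
pruned = magnitude_prune(w, 0.5)  # half the weights removed
```

Under the same parameter budget, such a pruned large model is what the entry compares against a small dense model of equal nonzero size.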
no code implementations • 29 Sep 2023 • Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu
In this work, we propose DeeDiff, an early exiting framework that adaptively allocates computation resources in each sampling step to improve the generation efficiency of diffusion models.
1 code implementation • 8 Aug 2023 • Binfeng Xu, Xukun Liu, Hua Shen, Zeyu Han, Yuhan Li, Murong Yue, Zhiyuan Peng, Yuchen Liu, Ziyu Yao, Dongkuan Xu
We present gentopia, an ALM framework enabling flexible customization of agents through simple configurations, seamlessly integrating various language models, task formats, prompting modules, and plugins into a unified paradigm.
1 code implementation • ICCV 2023 • Dongyao Zhu, Bowen Lei, Jie Zhang, Yanbo Fang, Ruqi Zhang, Yiqun Xie, Dongkuan Xu
Neural networks trained on distilled data often produce over-confident output and require correction by calibration methods.
1 code implementation • 19 Jul 2023 • Longfeng Wu, Bowen Lei, Dongkuan Xu, Dawei Zhou
In particular, to quantify the uncertainties in RCA, we develop a node-level uncertainty quantification algorithm to model the overlapping support regions with high uncertainty. To handle the rarity of minority classes in miscalibration calculation, we generalize the distribution-based calibration metric to the instance level and propose the first individual calibration measurement on graphs, named Expected Individual Calibration Error (EICE).
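For context, the distribution-based calibration metric that EICE generalizes to the instance level is the standard binned Expected Calibration Error, which averages the gap between accuracy and confidence over confidence bins. A minimal sketch (the bin count and weighting scheme are illustrative assumptions):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average over confidence bins of
    |mean accuracy - mean confidence| within each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            weight = in_bin.mean()  # fraction of samples in this bin
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += weight * gap
    return ece

# Over-confident predictions (high confidence, mixed correctness) raise ECE.
ece = expected_calibration_error([0.9, 0.8, 0.95, 0.6], [1, 1, 0, 1])
```

Because this metric pools samples into bins, rare minority-class nodes contribute little to it, which motivates the instance-level generalization described above.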
1 code implementation • 1 Jul 2023 • Ziqing Wang, Qidong Zhao, Jinku Cui, Xu Liu, Dongkuan Xu
To address these limitations, we introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures.
2 code implementations • 23 May 2023 • Binfeng Xu, Zhiyuan Peng, Bowen Lei, Subhabrata Mukherjee, Yuchen Liu, Dongkuan Xu
Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution.
no code implementations • 24 Apr 2023 • Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen, Caiwen Ding
Experimental results show that NDSNN achieves up to 20.52% improvement in accuracy on Tiny-ImageNet using ResNet-19 (with a sparsity of 99%) as compared to other SOTA methods (e.g., Lottery Ticket Hypothesis (LTH), SET-SNN, RigL-SNN).
1 code implementation • 21 Mar 2023 • Dongsheng Luo, Wei Cheng, Yingheng Wang, Dongkuan Xu, Jingchao Ni, Wenchao Yu, Xuchao Zhang, Yanchi Liu, Yuncong Chen, Haifeng Chen, Xiang Zhang
A key component of contrastive learning is to select appropriate augmentations imposing some priors to construct feasible positive samples, such that an encoder can be trained to learn robust and discriminative representations.
no code implementations • 27 Feb 2023 • Yue Xiang, Dongyao Zhu, Bowen Lei, Dongkuan Xu, Ruqi Zhang
Gradients have been exploited in proposal distributions to accelerate the convergence of Markov chain Monte Carlo algorithms on discrete distributions.
1 code implementation • 18 Feb 2023 • Bowen Lei, Ruqi Zhang, Dongkuan Xu, Bani Mallick
Previous research has shown that deep neural networks tend to be over-confident, and we find that sparse training exacerbates this problem.
1 code implementation • 9 Jan 2023 • Bowen Lei, Dongkuan Xu, Ruqi Zhang, Shuren He, Bani K. Mallick
To accelerate and stabilize the convergence of sparse training, we analyze the gradient changes and develop an adaptive gradient correction method.
2 code implementations • CVPR 2023 • Lei Zhang, Jie Zhang, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Yao Li, Dongkuan Xu
Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but efficient synthetic training datasets from large ones.
no code implementations • 30 Nov 2022 • Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding
We further design an acquisition function, provide theoretical guarantees for the proposed method, and clarify its convergence property.
1 code implementation • CVPR 2023 • Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu
To handle this challenge, we propose a novel early exiting strategy for unified visual language models, named MuE, which dynamically skips layers in both the encoder and the decoder based on layer-wise input similarities and supports multiple early exits.
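The idea of exiting once layer inputs stop changing can be sketched with cosine similarity between consecutive hidden states as a saturation signal (the threshold value and toy layers below are illustrative assumptions, not the MuE implementation):

```python
import numpy as np

def run_with_early_exit(layers, x, threshold=0.99):
    """Apply layers in order, exiting once consecutive hidden states
    become nearly parallel (cosine similarity above threshold)."""
    hidden = x
    for depth, layer in enumerate(layers, start=1):
        new_hidden = layer(hidden)
        cos = np.dot(hidden.ravel(), new_hidden.ravel()) / (
            np.linalg.norm(hidden) * np.linalg.norm(new_hidden) + 1e-12)
        hidden = new_hidden
        if cos > threshold:  # representation has saturated: exit early
            return hidden, depth
    return hidden, len(layers)

# Toy layers: the second one barely changes its input, triggering an exit
# before the third layer is ever evaluated.
layers = [lambda h: h[::-1], lambda h: h * 1.0001, lambda h: -h]
out, depth_used = run_with_early_exit(layers, np.array([1.0, 2.0, 3.0]))
```

Skipping the remaining layers for "easy" inputs is what yields the per-input compute savings such strategies target.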
no code implementations • 15 Nov 2022 • Qin Zhang, Shangsi Chen, Dongkuan Xu, Qingqing Cao, Xiaojun Chen, Trevor Cohn, Meng Fang
Thus, a trade-off between accuracy, memory consumption and processing speed is pursued.
no code implementations • 16 Jul 2022 • Ian En-Hsu Yen, Zhibin Xiao, Dongkuan Xu
The degree of sparsity one can exploit has also grown as larger model sizes have been considered, following the trend of pre-training giant models.
no code implementations • 21 Jun 2022 • Shaoyi Huang, Ning Liu, Yueying Liang, Hongwu Peng, Hongjia Li, Dongkuan Xu, Mimi Xie, Caiwen Ding
On MRPC, we obtain a 4.6 higher score than the SOTA at the same overall pruning ratio of 0.5.
no code implementations • 29 Jan 2022 • Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao
Our framework AutoDistil addresses the above challenges with the following steps: (a) it incorporates inductive bias and heuristics to partition the Transformer search space into K compact sub-spaces (K=3 for typical student sizes of base, small, and tiny); (b) it trains one SuperLM for each sub-space using a task-agnostic objective (e.g., self-attention distillation) with weight-sharing of students; (c) it performs a lightweight search for the optimal student without re-training.
no code implementations • NeurIPS 2021 • Dongkuan Xu, Wei Cheng, Dongsheng Luo, Haifeng Chen, Xiang Zhang
The key point of this framework is to follow the Information Bottleneck principle: reduce the mutual information between contrastive parts while keeping task-relevant information intact, at both the individual-module level and the whole-framework level, so that the information loss during graph representation learning is minimized.
no code implementations • ACL 2022 • Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding
Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit.
no code implementations • 29 Sep 2021 • Dongsheng Luo, Wei Cheng, Yingheng Wang, Dongkuan Xu, Jingchao Ni, Wenchao Yu, Xuchao Zhang, Yanchi Liu, Haifeng Chen, Xiang Zhang
How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question.
no code implementations • ACL 2021 • Xin Dong, Yaxin Zhu, Zuohui Fu, Dongkuan Xu, Gerard de Melo
Due to recent pretrained multilingual representation models, it has become feasible to exploit labeled data from one language to train a cross-lingual model that can then be applied to multiple new languages.
1 code implementation • NAACL 2021 • Dongkuan Xu, Ian E. H. Yen, Jinxi Zhao, Zhibin Xiao
In particular, common wisdom in CNN pruning states that sparse pruning compresses a model more than reducing the number of channels and layers does (Elsen et al., 2020; Zhu and Gupta, 2017), whereas existing work on sparse pruning of BERT yields inferior results to small-dense counterparts such as TinyBERT (Jiao et al., 2020).
3 code implementations • NeurIPS 2020 • Dongsheng Luo, Wei Cheng, Dongkuan Xu, Wenchao Yu, Bo Zong, Haifeng Chen, Xiang Zhang
Explanations that interpret each instance independently are not sufficient to provide a global understanding of the learned GNN model, leading to a lack of generalizability and hindering its use in the inductive setting.
no code implementations • 29 Jul 2020 • Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen yang, Gerard de Melo
The resulting model then serves as a teacher to induce labels for unlabeled target language samples that can be used during further adversarial training, allowing us to gradually adapt our model to the target language.
no code implementations • 24 May 2020 • Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant Honavar
Specifically, L-DKGPR eliminates the need for ad hoc heuristics or trial and error using a novel adaptation of deep kernel learning that combines the expressive power of deep neural networks with the flexibility of non-parametric kernel methods.
no code implementations • 1 Mar 2020 • Hua Wei, Dongkuan Xu, Junjie Liang, Zhenhui Li
To the best of our knowledge, we are the first to learn to model the state transition of moving agents with system dynamics.
1 code implementation • 11 Nov 2019 • Junjie Liang, Dongkuan Xu, Yiwei Sun, Vasant Honavar
However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables, while accounting for complex correlation structure in the data and non-linear interactions among the variables.
no code implementations • 12 Dec 2016 • Dongkuan Xu, Jia Wu, Wei zhang, Yingjie Tian
To this end, we propose positive instance detection via graph updating for multiple instance learning, called PIGMIL, to detect TPI accurately.