Search Results for author: Dongkuan Xu

Found 33 papers, 12 papers with code

Students' Perceptions and Preferences of Generative Artificial Intelligence Feedback for Programming

no code implementations 17 Dec 2023 Zhengdong Zhang, Zihan Dong, Yang Shi, Noboru Matsuda, Thomas Price, Dongkuan Xu

This study demonstrated that ChatGPT could generate Java programming assignment feedback that students perceived as formative.


FP8-BERT: Post-Training Quantization for Transformer

no code implementations 10 Dec 2023 Jianwei Li, Tianchi Zhang, Ian En-Hsu Yen, Dongkuan Xu

Transformer-based models, such as BERT, have been applied to a wide range of natural language processing tasks.


Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection

no code implementations 19 Oct 2023 Jianwei Li, Weizhi Gao, Qi Lei, Dongkuan Xu

It is widely acknowledged that large and sparse models have higher accuracy than small and dense models under the same model size constraints.

Towards Robust Pruning: An Adaptive Knowledge-Retention Pruning Strategy for Language Models

no code implementations 19 Oct 2023 Jianwei Li, Qi Lei, Wei Cheng, Dongkuan Xu

The pruning objective has recently extended beyond accuracy and sparsity to robustness in language models.

DeeDiff: Dynamic Uncertainty-Aware Early Exiting for Accelerating Diffusion Model Generation

no code implementations 29 Sep 2023 Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu

In this work, we propose DeeDiff, an early exiting framework that adaptively allocates computation resources in each sampling step to improve the generation efficiency of diffusion models.


Gentopia: A Collaborative Platform for Tool-Augmented LLMs

1 code implementation 8 Aug 2023 Binfeng Xu, Xukun Liu, Hua Shen, Zeyu Han, Yuhan Li, Murong Yue, Zhiyuan Peng, Yuchen Liu, Ziyu Yao, Dongkuan Xu

We present gentopia, an ALM framework enabling flexible customization of agents through simple configurations, seamlessly integrating various language models, task formats, prompting modules, and plugins into a unified paradigm.

Rethinking Data Distillation: Do Not Overlook Calibration

1 code implementation ICCV 2023 Dongyao Zhu, Bowen Lei, Jie Zhang, Yanbo Fang, Ruqi Zhang, Yiqun Xie, Dongkuan Xu

Neural networks trained on distilled data often produce over-confident output and require correction by calibration methods.

Towards Reliable Rare Category Analysis on Graphs via Individual Calibration

1 code implementation 19 Jul 2023 Longfeng Wu, Bowen Lei, Dongkuan Xu, Dawei Zhou

In particular, to quantify the uncertainties in RCA, we develop a node-level uncertainty quantification algorithm to model the overlapping support regions with high uncertainty; to handle the rarity of minority classes in miscalibration calculation, we generalize the distribution-based calibration metric to the instance level and propose the first individual calibration measurement on graphs named Expected Individual Calibration Error (EICE).

Fraud Detection, Network Intrusion Detection, +1

AutoST: Training-free Neural Architecture Search for Spiking Transformers

no code implementations 1 Jul 2023 Ziqing Wang, Qidong Zhao, Jinku Cui, Xu Liu, Dongkuan Xu

To address these limitations, we introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures.

Neural Architecture Search

ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models

2 code implementations 23 May 2023 Binfeng Xu, Zhiyuan Peng, Bowen Lei, Subhabrata Mukherjee, Yuchen Liu, Dongkuan Xu

Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution.


Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

no code implementations 24 Apr 2023 Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen, Caiwen Ding

Experimental results show that NDSNN achieves up to 20.52% improvement in accuracy on Tiny-ImageNet using ResNet-19 (with a sparsity of 99%) as compared to other SOTA methods (e.g., Lottery Ticket Hypothesis (LTH), SET-SNN, RigL-SNN).

Time Series Contrastive Learning with Information-Aware Augmentations

1 code implementation 21 Mar 2023 Dongsheng Luo, Wei Cheng, Yingheng Wang, Dongkuan Xu, Jingchao Ni, Wenchao Yu, Xuchao Zhang, Yanchi Liu, Yuncong Chen, Haifeng Chen, Xiang Zhang

A key component of contrastive learning is to select appropriate augmentations imposing some priors to construct feasible positive samples, such that an encoder can be trained to learn robust and discriminative representations.

Contrastive Learning, Open-Ended Question Answering, +2

Efficient Informed Proposals for Discrete Distributions via Newton's Series Approximation

no code implementations 27 Feb 2023 Yue Xiang, Dongyao Zhu, Bowen Lei, Dongkuan Xu, Ruqi Zhang

Gradients have been exploited in proposal distributions to accelerate the convergence of Markov chain Monte Carlo algorithms on discrete distributions.

Efficient Exploration, Extractive Text Summarization, +2

Calibrating the Rigged Lottery: Making All Tickets Reliable

1 code implementation 18 Feb 2023 Bowen Lei, Ruqi Zhang, Dongkuan Xu, Bani Mallick

Previous research has shown that deep neural networks tend to be over-confident, and we find that sparse training exacerbates this problem.

Decision Making

Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction

1 code implementation 9 Jan 2023 Bowen Lei, Dongkuan Xu, Ruqi Zhang, Shuren He, Bani K. Mallick

To accelerate and stabilize the convergence of sparse training, we analyze the gradient changes and develop an adaptive gradient correction method.

Accelerating Dataset Distillation via Model Augmentation

2 code implementations CVPR 2023 Lei Zhang, Jie Zhang, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Yao Li, Dongkuan Xu

Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but efficient synthetic training datasets from large ones.

Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off

no code implementations 30 Nov 2022 Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding

We further design an acquisition function and provide the theoretical guarantees for the proposed method and clarify its convergence property.

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model

1 code implementation CVPR 2023 Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu

To handle this challenge, we propose a novel early exiting strategy for unified visual language models, named MuE, which dynamically skips layers in the encoder and decoder simultaneously, based on layer-wise input similarities, and allows multiple early exits.

Language Modelling

S4: a High-sparsity, High-performance AI Accelerator

no code implementations 16 Jul 2022 Ian En-Hsu Yen, Zhibin Xiao, Dongkuan Xu

The degree of sparsity one can exploit has also grown as larger model sizes have been considered, following the trend of pre-training giant models.

Quantization, Vocal Bursts Intensity Prediction

AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models

no code implementations 29 Jan 2022 Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao

Our framework AutoDistil addresses the above challenges with the following steps: (a) incorporates inductive bias and heuristics to partition the Transformer search space into K compact sub-spaces (K=3 for typical student sizes of base, small and tiny); (b) trains one SuperLM for each sub-space using a task-agnostic objective (e.g., self-attention distillation) with weight-sharing of students; (c) performs a lightweight search for the optimal student without re-training.

Inductive Bias, Knowledge Distillation, +1

InfoGCL: Information-Aware Graph Contrastive Learning

no code implementations NeurIPS 2021 Dongkuan Xu, Wei Cheng, Dongsheng Luo, Haifeng Chen, Xiang Zhang

The key point of this framework is to follow the Information Bottleneck principle to reduce the mutual information between contrastive parts while keeping task-relevant information intact at both the levels of the individual module and the entire framework so that the information loss during graph representation learning can be minimized.

Contrastive Learning, Graph Classification, +3

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

no code implementations ACL 2022 Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding

Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit.

Knowledge Distillation

Information-Aware Time Series Meta-Contrastive Learning

no code implementations 29 Sep 2021 Dongsheng Luo, Wei Cheng, Yingheng Wang, Dongkuan Xu, Jingchao Ni, Wenchao Yu, Xuchao Zhang, Yanchi Liu, Haifeng Chen, Xiang Zhang

How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question.

Contrastive Learning, Meta-Learning, +4

Data Augmentation with Adversarial Training for Cross-Lingual NLI

no code implementations ACL 2021 Xin Dong, Yaxin Zhu, Zuohui Fu, Dongkuan Xu, Gerard de Melo

Due to recent pretrained multilingual representation models, it has become feasible to exploit labeled data from one language to train a cross-lingual model that can then be applied to multiple new languages.

Cross-Lingual Natural Language Inference, Data Augmentation

Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm

1 code implementation NAACL 2021 Dongkuan Xu, Ian E. H. Yen, Jinxi Zhao, Zhibin Xiao

In particular, common wisdom in pruning CNNs states that sparse pruning compresses a model more than reducing the number of channels and layers does (Elsen et al., 2020; Zhu and Gupta, 2017), while existing work on sparse pruning of BERT yields results inferior to those of its small-dense counterparts such as TinyBERT (Jiao et al., 2020).

Network Pruning

Parameterized Explainer for Graph Neural Network

3 code implementations NeurIPS 2020 Dongsheng Luo, Wei Cheng, Dongkuan Xu, Wenchao Yu, Bo Zong, Haifeng Chen, Xiang Zhang

The unique explanation interpreting each instance independently is not sufficient to provide a global understanding of the learned GNN model, leading to a lack of generalizability and hindering it from being used in the inductive setting.

Graph Classification

Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification

no code implementations 29 Jul 2020 Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard de Melo

The resulting model then serves as a teacher to induce labels for unlabeled target language samples that can be used during further adversarial training, allowing us to gradually adapt our model to the target language.

General Classification, intent-classification, +4

Longitudinal Deep Kernel Gaussian Process Regression

no code implementations 24 May 2020 Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant Honavar

Specifically, L-DKGPR eliminates the need for ad hoc heuristics or trial and error using a novel adaptation of deep kernel learning that combines the expressive power of deep neural networks with the flexibility of non-parametric kernel methods.

Gaussian Processes, regression, +1

How Do We Move: Modeling Human Movement with System Dynamics

no code implementations 1 Mar 2020 Hua Wei, Dongkuan Xu, Junjie Liang, Zhenhui Li

To the best of our knowledge, we are the first to learn to model the state transition of moving agents with system dynamics.

Imitation Learning

LMLFM: Longitudinal Multi-Level Factorization Machine

1 code implementation 11 Nov 2019 Junjie Liang, Dongkuan Xu, Yiwei Sun, Vasant Honavar

However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables, while accounting for complex correlation structure in the data and non-linear interactions among the variables.

Variable Selection

PIGMIL: Positive Instance Detection via Graph Updating for Multiple Instance Learning

no code implementations 12 Dec 2016 Dongkuan Xu, Jia Wu, Wei Zhang, Yingjie Tian

To this end, we propose a positive instance detection method via graph updating for multiple instance learning, called PIGMIL, to detect TPI accurately.

Multiple Instance Learning
