Search Results for author: Yufei Ding

Found 35 papers, 10 papers with code

LightSeq2: Accelerated Training for Transformer-based Models on GPUs

1 code implementation • 12 Oct 2021 • Xiaohui Wang, Yang Wei, Ying Xiong, Guyue Huang, Xian Qian, Yufei Ding, Mingxuan Wang, Lei Li

In this paper, we present LightSeq2, a system to accelerate training for a general family of Transformer models on GPUs.

Machine Translation Speech Recognition +1

GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs

1 code implementation • 11 Jun 2020 • Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, Yufei Ding

As the emerging trend of graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings).

Distributed, Parallel, and Cluster Computing
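
For context, a minimal sketch of the irregular neighbor-aggregation kernel that GNN runtimes such as GNNAdvisor optimize; the CSR layout, toy graph, and mean aggregation below are illustrative assumptions, not the system's actual API:

```python
import numpy as np

# Toy graph in CSR form (illustrative data).
indptr = np.array([0, 2, 3, 5, 6])        # row offsets, 4 nodes
indices = np.array([1, 2, 0, 0, 3, 2])    # neighbor ids per node
feats = np.random.rand(4, 8).astype(np.float32)

def aggregate_mean(indptr, indices, feats):
    """Gather-and-reduce over each node's neighbors: the irregular
    hot loop that GNN runtime systems tune on GPUs."""
    out = np.zeros_like(feats)
    for v in range(len(indptr) - 1):
        nbrs = indices[indptr[v]:indptr[v + 1]]
        if len(nbrs) > 0:
            out[v] = feats[nbrs].mean(axis=0)
    return out

embeddings = aggregate_mean(indptr, indices, feats)  # shape (4, 8)
```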

TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs

2 code implementations • 3 Dec 2021 • Yuke Wang, Boyuan Feng, Zheng Wang, Guyue Huang, Yufei Ding

Recently, graph neural networks (GNNs), as the backbone of graph-based machine learning, demonstrate great success in various domains (e.g., e-commerce).

Translation

MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms

1 code implementation • 14 Sep 2022 • Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Kevin Barker, Ang Li, Yufei Ding

For irregularly sparse and fine-grained GNN workloads, such solutions miss the opportunity to jointly schedule/optimize the computation and communication operations for high-performance delivery.

Layout Design Management

APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores

1 code implementation • 23 Jun 2021 • Boyuan Feng, Yuke Wang, Tong Geng, Ang Li, Yufei Ding

Over the years, accelerating neural networks with quantization has been widely studied.

Quantization
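
As a rough illustration of the idea, a sketch of symmetric uniform quantization to an arbitrary bit width; APNN-TC's actual Tensor Core bit encodings and kernels are not reproduced here:

```python
import numpy as np

def quantize_uniform(x, bits):
    """Symmetric uniform quantization to an arbitrary bit width.
    A generic sketch, not APNN-TC's implementation."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_uniform(w, bits=2)      # 2-bit weights in {-2, ..., 1}
w_hat = q.astype(np.float32) * s        # dequantized approximation
```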

DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions

1 code implementation • 4 Jan 2021 • Yuke Wang, Boyuan Feng, Yufei Ding

It also has a profound impact on improving the applicability of compute- and memory-intensive CNNs to a broad range of platforms, such as mobile devices, which generally lack computation power and memory.

Faith: An Efficient Framework for Transformer Verification on GPUs

1 code implementation • 23 Sep 2022 • Boyuan Feng, Tianqi Tang, Yuke Wang, Zhaodong Chen, Zheng Wang, Shu Yang, Yuan Xie, Yufei Ding

In this paper, we propose Faith, an efficient framework for transformer verification on GPUs.

Sentence

Domain-Adversarial Multi-Task Framework for Novel Therapeutic Property Prediction of Compounds

1 code implementation • 28 Sep 2018 • Lingwei Xie, Song He, Shu Yang, Boyuan Feng, Kun Wan, Zhongnan Zhang, Xiaochen Bo, Yufei Ding

In this paper, we propose a novel domain-adversarial multi-task framework for integrating shared knowledge from multiple domains.

Property Prediction

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

1 code implementation • 3 Nov 2019 • Lei Deng, Yujie Wu, Yifan Hu, Ling Liang, Guoqi Li, Xing Hu, Yufei Ding, Peng Li, Yuan Xie

As is well known, the huge memory and compute costs of both artificial neural networks (ANNs) and spiking neural networks (SNNs) greatly hinder their efficient deployment on edge devices.

Model Compression Quantization
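
A minimal sketch of the magnitude-based projection step that ADMM-style pruning alternates with training; the paper's full ADMM formulation and activity regularization are omitted:

```python
import numpy as np

def project_topk(w, sparsity):
    """Projection onto the sparsity constraint: keep only the
    largest-magnitude weights. Illustrative; the full ADMM loop
    alternates this step with regularized SNN training."""
    k = max(1, int(w.size * (1.0 - sparsity)))
    thresh = np.sort(np.abs(w).ravel())[-k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

w = np.random.randn(128, 128)
w_pruned = project_topk(w, sparsity=0.9)   # ~90% of entries zeroed
```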

Dynamic Sparse Graph for Efficient Deep Learning

no code implementations • ICLR 2019 • Liu Liu, Lei Deng, Xing Hu, Maohua Zhu, Guoqi Li, Yufei Ding, Yuan Xie

We propose to execute deep neural networks (DNNs) with dynamic and sparse graph (DSG) structure for compressive memory and accelerative execution during both training and inference.

Dimensionality Reduction
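
A conceptual sketch of forwarding through a dynamic sparse graph, where only the largest activations are propagated; the threshold-by-magnitude rule here is an assumption standing in for the paper's dimension-reduction-based selection:

```python
import numpy as np

def dsg_forward(x, w, keep=0.1):
    """One layer executed on a dynamic sparse graph: only the top
    fraction of activations (by magnitude) is propagated."""
    a = x @ w
    k = max(1, int(a.size * keep))
    thresh = np.partition(np.abs(a).ravel(), -k)[-k]
    return np.where(np.abs(a) >= thresh, a, 0.0)

x = np.random.randn(32, 256)
w = np.random.randn(256, 256)
h = dsg_forward(x, w, keep=0.1)   # ~90% of activations zeroed out
```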

Reconciling Feature-Reuse and Overfitting in DenseNet with Specialized Dropout

no code implementations • ICLR 2019 • Kun Wan, Boyuan Feng, Lingwei Xie, Yufei Ding

The insights attained here could potentially be applied as a general approach for boosting the accuracy of other CNN models with similar nonlinear connections.

Penetrating the Fog: the Path to Efficient CNN Models

no code implementations • ICLR 2019 • Kun Wan, Boyuan Feng, Shu Yang, Yufei Ding

In this paper, we are the first in the field to consider how to craft an effective sparse kernel design by eliminating the large design space.

PCNN: Environment Adaptive Model Without Finetuning

no code implementations • ICLR 2019 • Boyuan Feng, Kun Wan, Shu Yang, Yufei Ding

Convolutional Neural Networks (CNNs) have achieved tremendous success for many computer vision tasks, which shows a promising perspective of deploying CNNs on mobile platforms.

Transfer Learning

AccD: A Compiler-based Framework for Accelerating Distance-related Algorithms on CPU-FPGA Platforms

no code implementations • 26 Aug 2019 • Yuke Wang, Boyuan Feng, Gushu Li, Lei Deng, Yuan Xie, Yufei Ding

As a promising solution to boost the performance of distance-related algorithms (e.g., K-means and KNN), FPGA-based acceleration has attracted a lot of attention, but it also comes with numerous challenges.

Distributed, Parallel, and Cluster Computing Programming Languages
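
For background, the distance kernel at the heart of K-means and KNN, expanded so its dominant cost is a single large matrix multiply; this is generic math, not AccD's compiler output:

```python
import numpy as np

def pairwise_sq_dist(X, C):
    """Squared Euclidean distances between all rows of X and C,
    expanded as ||x||^2 - 2*x.c + ||c||^2 so the dominant cost is
    one large matmul -- the regular compute accelerators exploit."""
    return ((X ** 2).sum(1)[:, None]
            - 2.0 * (X @ C.T)
            + (C ** 2).sum(1)[None, :])

X = np.random.randn(1000, 16)               # points
C = np.random.randn(8, 16)                  # K-means centroids
labels = pairwise_sq_dist(X, C).argmin(1)   # nearest-centroid labels
```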

Neural Network Model Extraction Attacks in Edge Devices by Hearing Architectural Hints

no code implementations • 10 Mar 2019 • Xing Hu, Ling Liang, Lei Deng, Shuangchen Li, Xinfeng Xie, Yu Ji, Yufei Ding, Chang Liu, Timothy Sherwood, Yuan Xie

As neural networks continue their reach into nearly every aspect of software operations, the details of those networks become an increasingly sensitive subject.

Cryptography and Security Hardware Architecture

Proq: Projection-based Runtime Assertions for Debugging on a Quantum Computer

no code implementations • 28 Nov 2019 • Gushu Li, Li Zhou, Nengkun Yu, Yufei Ding, Mingsheng Ying, Yuan Xie

In this paper, we propose Proq, a runtime assertion scheme for testing and debugging quantum programs on a quantum computer.

Exploring Adversarial Attack in Spiking Neural Networks with Spike-Compatible Gradient

no code implementations • 1 Jan 2020 • Ling Liang, Xing Hu, Lei Deng, Yujie Wu, Guoqi Li, Yufei Ding, Peng Li, Yuan Xie

Recently, backpropagation-through-time-inspired learning algorithms have been widely introduced into SNNs to improve performance, which opens the possibility of attacking these models accurately given spatio-temporal gradient maps.

Adversarial Attack

Weighted-Sampling Audio Adversarial Example Attack

no code implementations • 26 Jan 2019 • Xiaolei Liu, Xiaosong Zhang, Kun Wan, Qingxin Zhu, Yufei Ding

In this paper, we propose weighted-sampling audio adversarial examples, focusing on the numbers and the weights of distortion to reinforce the attack.

Adversarial Attack Automatic Speech Recognition +3

SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization

no code implementations • 9 Jul 2020 • Boyuan Feng, Yuke Wang, Xu Li, Shu Yang, Xueqiao Peng, Yufei Ding

With the increasing popularity of graph-based learning, Graph Neural Networks (GNNs) have attracted significant attention from both research and industry because of their high accuracy.

Quantization

Boosting Deep Neural Network Efficiency with Dual-Module Inference

no code implementations • ICML 2020 • Liu Liu, Lei Deng, Zhaodong Chen, Yuke Wang, Shuangchen Li, Jingwei Zhang, Yihua Yang, Zhenyu Gu, Yufei Ding, Yuan Xie

Using Deep Neural Networks (DNNs) in machine learning tasks is promising in delivering high-quality results but challenging to meet stringent latency requirements and energy constraints because of the memory-bound and the compute-bound execution pattern of DNNs.

An Efficient Quantitative Approach for Optimizing Convolutional Neural Networks

no code implementations • 11 Sep 2020 • Yuke Wang, Boyuan Feng, Xueqiao Peng, Yufei Ding

To clear these hurdles, we propose 3D-Receptive Field (3DRF), an explainable and easy-to-compute metric, to estimate the quality of a CNN architecture and guide the search process of designs.

Image Classification Object Detection +1
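
As background for receptive-field-style metrics, the classic 1-D receptive-field recurrence; the paper's exact 3DRF formula is not reproduced here:

```python
def receptive_field(layers):
    """Classic recurrence rf += (k - 1) * jump over conv layers,
    given as (kernel, stride) pairs. Background only; not 3DRF."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

print(receptive_field([(3, 1), (3, 2), (3, 2)]))   # -> 9
```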

Scalable Adversarial Attack on Graph Neural Networks with Alternating Direction Method of Multipliers

no code implementations • 22 Sep 2020 • Boyuan Feng, Yuke Wang, Xu Li, Yufei Ding

Graph neural networks (GNNs) have achieved high performance in analyzing graph-structured data and have been widely deployed in safety-critical areas, such as finance and autonomous driving.

Adversarial Attack Autonomous Driving

Uncertainty-aware Attention Graph Neural Network for Defending Adversarial Attacks

no code implementations • 22 Sep 2020 • Boyuan Feng, Yuke Wang, Zheng Wang, Yufei Ding

With the increasing popularity of graph-based learning, graph neural networks (GNNs) emerge as the essential tool for gaining insights from graphs.

Rubik: A Hierarchical Architecture for Efficient Graph Learning

no code implementations • 26 Sep 2020 • Xiaobing Chen, Yuke Wang, Xinfeng Xie, Xing Hu, Abanti Basak, Ling Liang, Mingyu Yan, Lei Deng, Yufei Ding, Zidong Du, Yunji Chen, Yuan Xie

Graph convolutional network (GCN) emerges as a promising direction to learn the inductive representation in graph data commonly used in widespread applications, such as E-commerce, social networks, and knowledge graphs.

Hardware Architecture

MPU: Towards Bandwidth-abundant SIMT Processor via Near-bank Computing

no code implementations • 11 Mar 2021 • Xinfeng Xie, Peng Gu, Yufei Ding, Dimin Niu, Hongzhong Zheng, Yuan Xie

For general purpose scenarios, lightweight hardware designs for diverse data paths, architectural supports for the SIMT programming model, and end-to-end software optimizations remain challenging.

Hardware Architecture

DFSSATTEN: Dynamic Fine-grained Structured Sparse Attention Mechanism

no code implementations • 29 Sep 2021 • Zhaodong Chen, Liu Liu, Yuying Quan, Zheng Qu, Yufei Ding, Yuan Xie

Transformers are becoming mainstream solutions for various tasks in NLP and computer vision.

Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective

no code implementations • 18 Oct 2021 • Hengrui Zhang, Zhongming Yu, Guohao Dai, Guyue Huang, Yufei Ding, Yuan Xie, Yu Wang

The same data are propagated through the graph structure to perform the same neural operation multiple times in GNNs, leading to redundant computation which accounts for 92.4% of total operators.
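
A small illustration of reordering operators on the GNN computational graph: aggregating before or after the dense transform gives the same result at very different cost, the kind of computation/IO trade-off this analysis targets (dense matrices stand in for a sparse adjacency):

```python
import numpy as np

A = (np.random.rand(100, 100) < 0.05).astype(float)   # adjacency
X = np.random.randn(100, 64)                          # node features
W = np.random.randn(64, 16)                           # layer weights

h1 = A @ (X @ W)      # transform every node first, then aggregate
h2 = (A @ X) @ W      # aggregate raw features first, then transform
assert np.allclose(h1, h2)    # same semantics, different cost profile
```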

Transformer Acceleration with Dynamic Sparse Attention

no code implementations • 21 Oct 2021 • Liu Liu, Zheng Qu, Zhaodong Chen, Yufei Ding, Yuan Xie

We demonstrate that the sparse patterns are dynamic, depending on input sequences.
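
A conceptual sketch of attention whose sparsity pattern is derived from the input itself, with each query keeping only its highest-scoring keys; this is not the paper's GPU kernel design:

```python
import numpy as np

def dynamic_sparse_attention(q, k, v, density=0.25):
    """Attention with an input-dependent sparsity pattern: per query
    row, keep only the top-scoring fraction of keys."""
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    top = max(1, int(scores.shape[-1] * density))
    cutoff = np.partition(scores, -top, axis=-1)[:, -top][:, None]
    masked = np.where(scores >= cutoff, scores, -np.inf)
    p = np.exp(masked - masked.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

n, d = 16, 32
q, k, v = (np.random.randn(n, d) for _ in range(3))
out = dynamic_sparse_attention(q, k, v)      # shape (16, 32)
```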

Dual-module Inference for Efficient Recurrent Neural Networks

no code implementations • 25 Sep 2019 • Liu Liu, Lei Deng, Shuangchen Li, Jingwei Zhang, Yihua Yang, Zhenyu Gu, Yufei Ding, Yuan Xie

Using Recurrent Neural Networks (RNNs) in sequence modeling tasks is promising in delivering high-quality results but challenging to meet stringent latency requirements because of the memory-bound execution pattern of RNNs.

Mitigating Noise-Induced Gradient Vanishing in Variational Quantum Algorithm Training

no code implementations • 25 Nov 2021 • Anbang Wu, Gushu Li, Yufei Ding, Yuan Xie

In this paper, we propose a novel training scheme to mitigate such noise-induced gradient vanishing.

Towards Efficient Ansatz Architecture for Variational Quantum Algorithms

no code implementations • 26 Nov 2021 • Anbang Wu, Gushu Li, Yuke Wang, Boyuan Feng, Yufei Ding, Yuan Xie

In this paper, we propose a novel training scheme to mitigate such noise-induced gradient vanishing.

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism

no code implementations • 28 Feb 2022 • Zhaodong Chen, Yuying Quan, Zheng Qu, Liu Liu, Yufei Ding, Yuan Xie

We evaluate the 1:2 and 2:4 sparsity under different configurations and achieve 1.27x to 1.89x speedups over the full-attention mechanism.
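
A sketch of the N:M pattern itself, keeping the n largest magnitudes in every group of m consecutive weights; Sparse Tensor Core execution is not modeled here:

```python
import numpy as np

def prune_n_m(w, n=2, m=4):
    """N:M fine-grained structured sparsity: within every group of
    m consecutive weights, zero out all but the n largest magnitudes."""
    groups = w.reshape(-1, m)
    drop = np.argsort(np.abs(groups), axis=1)[:, :m - n]
    out = groups.copy()
    np.put_along_axis(out, drop, 0.0, axis=1)
    return out.reshape(w.shape)

w = np.random.randn(8, 16)
w24 = prune_n_m(w)                # exactly 2 nonzeros per group of 4
```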

Tackling the Qubit Mapping Problem for NISQ-Era Quantum Devices

no code implementations • International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2019 • Gushu Li, Yufei Ding, Yuan Xie

Because they give little consideration to hardware constraints, e.g., the limited connections between physical qubits that enable two-qubit gates, most quantum algorithms cannot be directly executed on Noisy Intermediate-Scale Quantum (NISQ) devices.
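
A toy illustration of the constraint being solved, checking whether a logical two-qubit gate maps to physically connected qubits; the paper's SWAP-insertion search heuristic is not reproduced here:

```python
# Toy coupling map: physical qubit pairs that support two-qubit gates.
coupling = {(0, 1), (1, 2), (2, 3)}          # a 4-qubit line

def executable(mapping, gate):
    """True if logical two-qubit gate (a, b) maps to adjacent qubits."""
    p, q = mapping[gate[0]], mapping[gate[1]]
    return (p, q) in coupling or (q, p) in coupling

mapping = {0: 0, 1: 2, 2: 1, 3: 3}           # logical -> physical
print(executable(mapping, (0, 1)))           # False: needs a SWAP first
```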
