no code implementations • 28 Feb 2022 • Zhaodong Chen, Yuying Quan, Zheng Qu, Liu Liu, Yufei Ding, Yuan Xie
We evaluate the 1:2 and 2:4 sparsity under different configurations and achieve 1.27x to 1.89x speedups over the full-attention mechanism.
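A minimal sketch of the 2:4 pattern, assuming it is applied along the last dimension of an attention-score tensor: keep the two largest-magnitude entries in every group of four and zero the rest. The function name and shapes are illustrative, not from the paper.

```python
import torch

def prune_2_to_4(scores: torch.Tensor) -> torch.Tensor:
    """Keep the 2 largest-magnitude entries in every group of 4 (illustrative)."""
    *lead, n = scores.shape
    groups = scores.reshape(*lead, n // 4, 4)
    topk = groups.abs().topk(2, dim=-1).indices           # top-2 per group of 4
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, topk, True)
    return (groups * mask).reshape(*lead, n)

scores = torch.randn(2, 8, 16)        # (heads, queries, keys); keys divisible by 4
sparse = prune_2_to_4(scores)         # exactly 2 nonzeros per group of 4
```

The mask itself is cheap to compute; the reported speedups come from hardware support for this pattern (e.g., Sparse Tensor Cores), not from the masking alone.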
no code implementations • 21 Oct 2021 • Liu Liu, Zheng Qu, Zhaodong Chen, Yufei Ding, Yuan Xie
We demonstrate that the sparse patterns are dynamic, depending on input sequences.
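A hedged illustration of that observation: if the mask is derived by thresholding the score matrix per query, the sparse pattern necessarily changes with the input. The keep ratio and shapes below are assumptions for the example, not the paper's recipe.

```python
import torch

def dynamic_attention_mask(q, k, keep_ratio=0.25):
    """Keep only the highest-scoring key positions per query (illustrative)."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    k_keep = max(1, int(keep_ratio * scores.shape[-1]))
    cutoff = scores.topk(k_keep, dim=-1).values[..., -1:]  # per-query threshold
    return scores >= cutoff            # boolean pattern varies with the input

q, k = torch.randn(4, 64, 32), torch.randn(4, 64, 32)
mask = dynamic_attention_mask(q, k)    # different inputs give different masks
```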
no code implementations • 29 Sep 2021 • Zhaodong Chen, Liu Liu, Yuying Quan, Zheng Qu, Yufei Ding, Yuan Xie
Transformers are becoming mainstream solutions for various tasks in NLP and computer vision.
no code implementations • 25 Jul 2021 • Ling Liang, Zheng Qu, Zhaodong Chen, Fengbin Tu, Yujie Wu, Lei Deng, Guoqi Li, Peng Li, Yuan Xie
Although spiking neural networks (SNNs) benefit from bio-plausible neural modeling, their low accuracy under common local synaptic plasticity learning rules limits their application in many practical tasks.
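To make "local synaptic plasticity" concrete, here is a toy leaky integrate-and-fire neuron with a Hebbian/STDP-flavored update that uses only information local to each synapse (a presynaptic trace and the postsynaptic spike). This is a generic textbook-style sketch, not the paper's learning rule, and all constants are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in = 100, 20
w = rng.uniform(0.0, 0.5, n_in)        # synaptic weights
v, v_th, tau = 0.0, 1.0, 20.0          # membrane potential, threshold, time constant
pre_trace = np.zeros(n_in)             # decaying per-synapse eligibility trace

for t in range(T):
    spikes_in = (rng.random(n_in) < 0.1).astype(float)  # random input spikes
    pre_trace += spikes_in - pre_trace / tau            # presynaptic trace update
    v += w @ spikes_in - v / tau                        # leaky integration
    if v >= v_th:                                       # output spike
        v = 0.0                                         # reset membrane potential
        w += 0.01 * pre_trace     # local update: strengthen recently active synapses
```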
no code implementations • 24 Jan 2019 • Xu Qian, Zheng Qu, Peter Richtárik
We study the problem of minimizing the average of a very large number of smooth functions, which is of key importance in training supervised learning models.
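Concretely, the problem is to minimize (1/n) * sum_i f_i(x) over x, with n very large. The sketch below runs plain SGD on a least-squares instance just to fix the setting; it is not the paper's method or sampling scheme, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 10
A, b = rng.standard_normal((n, d)), rng.standard_normal(n)

def grad_i(x, i):
    """Gradient of the i-th summand f_i(x) = 0.5 * (a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
for t in range(5000):                  # plain SGD on the average of n functions
    i = rng.integers(n)                # one random summand per step
    x -= 0.01 / (1 + t) ** 0.5 * grad_i(x, i)
```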
no code implementations • 30 Dec 2015 • Zeyuan Allen-Zhu, Zheng Qu, Peter Richtárik, Yang Yuan
Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems.
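A minimal sketch of one standard accelerated randomized coordinate descent recursion (an APPROX/ACDM-style scheme with uniform sampling, serial case) on a convex quadratic; the paper's contribution concerns non-uniform sampling and sharper rates, which this sketch does not implement.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50
M = rng.standard_normal((d, d))
Q = M.T @ M + np.eye(d)                # positive-definite quadratic
b = rng.standard_normal(d)
L = np.diag(Q)                         # coordinate-wise Lipschitz constants

def grad_i(y, i):
    """i-th partial derivative of f(x) = 0.5 x'Qx - b'x."""
    return Q[i] @ y - b[i]

x, z = np.zeros(d), np.zeros(d)
theta = 1.0 / d
for _ in range(20000):
    y = (1 - theta) * x + theta * z
    i = rng.integers(d)
    g = grad_i(y, i)
    x = y.copy(); x[i] -= g / L[i]                # short (gradient) step
    z = z.copy(); z[i] -= g / (d * theta * L[i])  # long (momentum) step
    theta = (np.sqrt(theta**4 + 4 * theta**2) - theta**2) / 2
```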
no code implementations • NeurIPS 2015 • Zheng Qu, Peter Richtárik, Tong Zhang
We study the problem of minimizing the average of a large number of smooth convex functions penalized with a strongly convex regularizer.
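Written out (in notation of my choosing, not necessarily the paper's), the objective is

```latex
\min_{x \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} f_i(x) + \psi(x),
\qquad f_i \ \text{smooth convex}, \quad \psi \ \text{strongly convex}.
```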
no code implementations • 27 Feb 2015 • Dominik Csiba, Zheng Qu, Peter Richtárik
This paper introduces AdaSDCA: an adaptive variant of stochastic dual coordinate ascent (SDCA) for solving the regularized empirical risk minimization problems.
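For reference, plain SDCA with uniform sampling on ridge regression looks as follows; the dual coordinate step has a closed form for the squared loss. AdaSDCA's contribution, adapting the sampling probabilities to the data, is only flagged in a comment and not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 500, 20, 0.1
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)

alpha = np.zeros(n)                    # dual variables, one per example
w = np.zeros(d)                        # maintained as w = X.T @ alpha / (lam * n)
for _ in range(20 * n):
    i = rng.integers(n)                # uniform sampling; AdaSDCA adapts this choice
    # Closed-form maximization of the dual along coordinate i (squared loss).
    delta = (y[i] - X[i] @ w - alpha[i]) / (1 + X[i] @ X[i] / (lam * n))
    alpha[i] += delta
    w += delta * X[i] / (lam * n)      # keep the primal point in sync
```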
no code implementations • 8 Feb 2015 • Zheng Qu, Peter Richtárik, Martin Takáč, Olivier Fercoq
We propose a new algorithm for minimizing regularized empirical loss: Stochastic Dual Newton Ascent (SDNA).
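The essence of a Newton-type step over a sampled block can be sketched on a quadratic: pick a small random subset of coordinates and solve that block exactly with the corresponding sub-block of the curvature matrix, rather than taking a scaled gradient step. This toy version on a primal quadratic only conveys the block-Newton idea, not SDNA's actual dual formulation or theory.

```python
import numpy as np

rng = np.random.default_rng(0)
d, tau = 40, 5                         # dimension and sampled block size
M = rng.standard_normal((d, d))
Q = M.T @ M + np.eye(d)                # model problem: f(x) = 0.5 x'Qx - b'x
b = rng.standard_normal(d)

x = np.zeros(d)
for _ in range(2000):
    S = rng.choice(d, size=tau, replace=False)   # random block of coordinates
    g = Q[S] @ x - b[S]                          # gradient restricted to the block
    x[S] -= np.linalg.solve(Q[np.ix_(S, S)], g)  # exact Newton step on the block
```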
no code implementations • 27 Dec 2014 • Zheng Qu, Peter Richtárik
The design and complexity analysis of randomized coordinate descent methods, and in particular of variants which update a random subset (sampling) of coordinates in each iteration, depends on the notion of expected separable overapproximation (ESO).
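Roughly, and for a uniform sampling \hat{S} (notation adapted from this line of work, so treat the exact constants as an assumption): f admits an ESO with parameters v = (v_1, ..., v_n) if for all x and h

```latex
\mathbf{E}\!\left[ f\!\left(x + h_{[\hat{S}]}\right) \right]
\;\le\; f(x) + \frac{\mathbf{E}|\hat{S}|}{n}
\left( \langle \nabla f(x), h \rangle
+ \tfrac{1}{2} \textstyle\sum_{i=1}^{n} v_i h_i^2 \right),
```

where h_{[\hat{S}]} zeroes out the coordinates of h outside the sampled set. The right-hand side is separable across coordinates, which is what makes the resulting coordinate updates cheap and parallelizable.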
no code implementations • 27 Dec 2014 • Zheng Qu, Peter Richtárik
ALPHA is a remarkably flexible algorithm: in special cases, it reduces to deterministic and randomized methods such as gradient descent, coordinate descent, parallel coordinate descent and distributed coordinate descent, in both nonaccelerated and accelerated variants.
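The unifying role of the sampling can be seen in a toy loop that is not ALPHA itself: the same update recovers gradient descent when the sampling always returns every coordinate, randomized coordinate descent when it returns a singleton, and parallel coordinate descent for larger random subsets. The step size below is a deliberately conservative global choice.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 30
M = rng.standard_normal((d, d))
Q, b = M.T @ M + np.eye(d), rng.standard_normal(d)
step = 1.0 / np.linalg.eigvalsh(Q).max()   # conservative but valid for all samplings

def run(sampling, iters=3000):
    x = np.zeros(d)
    for _ in range(iters):
        S = sampling()                     # the sampling defines the method
        x[S] -= step * (Q[S] @ x - b[S])   # update only the sampled coordinates
    return x

x_gd  = run(lambda: np.arange(d))                          # all coords -> gradient descent
x_cd  = run(lambda: rng.integers(d, size=1))               # singleton -> coordinate descent
x_pcd = run(lambda: rng.choice(d, size=4, replace=False))  # subset -> parallel CD
```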
no code implementations • 21 Nov 2014 • Zheng Qu, Peter Richtárik, Tong Zhang
The distributed variant of Quartz is the first distributed SDCA-like method with an analysis for non-separable data.
no code implementations • 21 May 2014 • Olivier Fercoq, Zheng Qu, Peter Richtárik, Martin Takáč
We propose an efficient distributed randomized coordinate descent method for minimizing regularized non-strongly convex loss functions.
1 code implementation • 5 Oct 2013 • Zheng Qu, Daniel Wiese, Anuradha M. Annaswamy, Eugene Lavretsky
This paper presents a method to square up a generic MIMO system that already possesses transmission zeros.
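A numerical sketch of the setting, not the paper's construction: a tall system (more outputs than inputs) is squared up by appending an extra input column B2, and the invariant zeros of the resulting square system are read off as the finite generalized eigenvalues of the Rosenbrock pencil. Here B2 is a random placeholder; the paper's method chooses it so that the squared-up system's zeros are well placed even when the original system already has transmission zeros.

```python
import numpy as np
from scipy.linalg import eig

def invariant_zeros(A, B, C, D=None):
    """Invariant zeros of a square system via the Rosenbrock pencil (illustrative)."""
    n, m = A.shape[0], B.shape[1]
    D = np.zeros((m, m)) if D is None else D
    P = np.block([[A, B], [C, D]])                 # Rosenbrock system matrix
    E = np.block([[np.eye(n), np.zeros((n, m))],
                  [np.zeros((m, n + m))]])
    z = eig(P, E, right=False)                     # generalized eigenvalues
    return z[np.isfinite(z)]                       # finite ones are the zeros

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 1))                    # 1 input
C = rng.standard_normal((2, 4))                    # 2 outputs: tall, not square
B2 = rng.standard_normal((4, 1))                   # placeholder squaring-up column
print(invariant_zeros(A, np.hstack([B, B2]), C))   # zeros of the squared-up system
```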
Optimization and Control