Search Results for author: Zheng Qu

Found 14 papers, 1 paper with code

Dynamic N:M Fine-grained Structured Sparse Attention Mechanism

no code implementations • 28 Feb 2022 • Zhaodong Chen, Yuying Quan, Zheng Qu, Liu Liu, Yufei Ding, Yuan Xie

We evaluate the 1:2 and 2:4 sparsity under different configurations and achieve 1.27~1.89x speedups over the full-attention mechanism.
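
The 1:2 and 2:4 ratios refer to N:M fine-grained structured sparsity: in every group of M consecutive attention-score entries, at most N are kept and the rest are zeroed. A minimal NumPy sketch of such a mask, assuming magnitude-based selection on pre-softmax scores (illustrative only; the paper's GPU kernels and dynamic selection logic are not shown):

```python
import numpy as np

def nm_prune(scores: np.ndarray, n: int = 2, m: int = 4) -> np.ndarray:
    """Zero out all but the n largest-magnitude entries in every group of
    m consecutive entries (here: 2:4 fine-grained structured sparsity)."""
    blocks = scores.reshape(-1, m)                          # group into blocks of m
    drop = np.argsort(np.abs(blocks), axis=1)[:, : m - n]   # smallest m-n per block
    mask = np.ones_like(blocks, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return np.where(mask, blocks, 0.0).reshape(scores.shape)

# one row of pre-softmax attention scores (length must be divisible by m)
row = np.array([0.9, -0.1, 0.3, 0.05, 1.2, 0.7, -0.2, 0.0])
print(nm_prune(row))   # at most 2 nonzeros survive in each group of 4
```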

Transformer Acceleration with Dynamic Sparse Attention

no code implementations • 21 Oct 2021 • Liu Liu, Zheng Qu, Zhaodong Chen, Yufei Ding, Yuan Xie

We demonstrate that the sparse patterns are dynamic, depending on input sequences.
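
In other words, the retained attention entries are chosen from the current queries and keys rather than fixed in advance. A toy sketch of an input-dependent mask, using a simple top-scoring-keys rule as a hypothetical stand-in for the paper's actual prediction mechanism:

```python
import numpy as np

def dynamic_sparse_attention(q, k, v, keep_ratio=0.25):
    """Toy dynamic sparsity: per query, keep only the highest-scoring keys.
    The mask depends on q and k, so it changes with every input sequence."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    kth = np.quantile(scores, 1.0 - keep_ratio, axis=-1, keepdims=True)
    scores = np.where(scores >= kth, scores, -np.inf)   # input-dependent mask
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

q = np.random.randn(8, 16); k = np.random.randn(8, 16); v = np.random.randn(8, 16)
print(dynamic_sparse_attention(q, k, v).shape)   # (8, 16)
```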

DFSSATTEN: Dynamic Fine-grained Structured Sparse Attention Mechanism

no code implementations • 29 Sep 2021 • Zhaodong Chen, Liu Liu, Yuying Quan, Zheng Qu, Yufei Ding, Yuan Xie

Transformers are becoming mainstream solutions for various tasks in NLP and computer vision.

H2Learn: High-Efficiency Learning Accelerator for High-Accuracy Spiking Neural Networks

no code implementations • 25 Jul 2021 • Ling Liang, Zheng Qu, Zhaodong Chen, Fengbin Tu, Yujie Wu, Lei Deng, Guoqi Li, Peng Li, Yuan Xie

Although spiking neural networks (SNNs) benefit from bio-plausible neural modeling, their low accuracy under common local synaptic plasticity learning rules limits their application in many practical tasks.

SAGA with Arbitrary Sampling

no code implementations • 24 Jan 2019 • Xu Qian, Zheng Qu, Peter Richtárik

We study the problem of minimizing the average of a very large number of smooth functions, which is of key importance in training supervised learning models.
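
The objective is the usual finite-sum problem min_x (1/n) Σ_i f_i(x). Below is a minimal sketch of the classical SAGA update with uniform sampling; the paper's contribution, an analysis under arbitrary (non-uniform and minibatch) samplings, is not reproduced here.

```python
import numpy as np

def saga(grad_i, x0, n, gamma, steps, rng=np.random.default_rng(0)):
    """Minimal SAGA loop for min_x (1/n) * sum_i f_i(x).
    grad_i(i, x) returns the gradient of the i-th component function."""
    x = x0.copy()
    table = np.array([grad_i(i, x) for i in range(n)])   # stored per-example gradients
    avg = table.mean(axis=0)
    for _ in range(steps):
        j = rng.integers(n)
        g_new = grad_i(j, x)
        x -= gamma * (g_new - table[j] + avg)             # SAGA search direction
        avg += (g_new - table[j]) / n                     # keep the running mean current
        table[j] = g_new
    return x

# toy least squares: f_i(x) = 0.5 * (a_i @ x - b_i)**2
A, b = np.random.randn(50, 5), np.random.randn(50)
x_hat = saga(lambda i, x: A[i] * (A[i] @ x - b[i]), np.zeros(5), 50, 0.01, 5000)
```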

Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling

no code implementations • 30 Dec 2015 • Zeyuan Allen-Zhu, Zheng Qu, Peter Richtárik, Yang Yuan

Accelerated coordinate descent is widely used in optimization due to its cheap per-iteration cost and scalability to large-scale problems.
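
Coordinate descent updates a single coordinate per iteration, and the choice of sampling distribution matters. A non-accelerated sketch with coordinates drawn proportionally to the square roots of their coordinate-wise smoothness constants, a non-uniform rule in the spirit of this line of work; the accelerated scheme itself is not reproduced:

```python
import numpy as np

def cd_nonuniform(A, b, steps, rng=np.random.default_rng(0)):
    """Toy coordinate descent for f(x) = 0.5 * x^T A x - b^T x with A SPD.
    Coordinate i is drawn with probability proportional to sqrt(L_i),
    where L_i = A[i, i] is the coordinate-wise smoothness constant."""
    n = A.shape[0]
    L = np.diag(A)
    p = np.sqrt(L) / np.sqrt(L).sum()
    x = np.zeros(n)
    for _ in range(steps):
        i = rng.choice(n, p=p)
        grad_i = A[i] @ x - b[i]
        x[i] -= grad_i / L[i]          # exact minimization along coordinate i
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]]); b = np.array([1.0, 2.0])
print(cd_nonuniform(A, b, 2000), np.linalg.solve(A, b))   # should roughly agree
```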

Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling

no code implementations • NeurIPS 2015 • Zheng Qu, Peter Richtárik, Tong Zhang

We study the problem of minimizing the average of a large number of smooth convex functions penalized with a strongly convex regularizer.
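
In standard notation (assumed here, not quoted from the paper), the primal problem and the Fenchel dual that SDCA-type methods such as Quartz work with look as follows; Quartz updates a primal vector w and dual variables α_i simultaneously, with the sampling of dual coordinates left arbitrary.

```latex
% Regularized ERM (phi_i: smooth convex losses, g: strongly convex regularizer)
\min_{w \in \mathbb{R}^d} \; P(w) = \frac{1}{n}\sum_{i=1}^{n} \phi_i(A_i^\top w) + \lambda\, g(w)
% and its Fenchel dual, over dual variables \alpha = (\alpha_1, \dots, \alpha_n):
\max_{\alpha} \; D(\alpha) = -\frac{1}{n}\sum_{i=1}^{n} \phi_i^*(-\alpha_i)
  - \lambda\, g^*\!\left(\frac{1}{\lambda n}\sum_{i=1}^{n} A_i \alpha_i\right)
```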

Stochastic Dual Coordinate Ascent with Adaptive Probabilities

no code implementations • 27 Feb 2015 • Dominik Csiba, Zheng Qu, Peter Richtárik

This paper introduces AdaSDCA: an adaptive variant of stochastic dual coordinate ascent (SDCA) for solving the regularized empirical risk minimization problems.
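
The adaptivity is in the sampling distribution: rather than fixing a probability for each dual coordinate in advance, the probabilities are recomputed from the current iterate so that coordinates with larger dual residuals are picked more often. The snippet below is a heavily hedged sketch of that idea only; the exact residual formula, normalization, and safeguards used by AdaSDCA are not reproduced from the paper.

```python
import numpy as np

def adaptive_probs(dual_residuals, eps=1e-12):
    """Hypothetical illustration: sampling probabilities proportional to the
    magnitude of each example's current dual residual."""
    r = np.abs(dual_residuals) + eps
    return r / r.sum()

# inside an SDCA-style loop one would then draw, e.g.:
#   i = rng.choice(n, p=adaptive_probs(kappa))   # kappa: current dual residuals
```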

SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization

no code implementations • 8 Feb 2015 • Zheng Qu, Peter Richtárik, Martin Takáč, Olivier Fercoq

We propose a new algorithm for minimizing regularized empirical loss: Stochastic Dual Newton Ascent (SDNA).
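
Compared with SDCA, each step uses second-order information over a random block of dual coordinates rather than a scalar update of a single one. A generic block-Newton sketch under assumed notation (D is a smooth concave dual objective with gradient and Hessian oracles; the paper's exact subproblem and convergence safeguards are not shown):

```python
import numpy as np

def block_newton_step(alpha, grad_D, hess_D, block):
    """Hypothetical block update: solve a small Newton system restricted to a
    randomly sampled block of dual coordinates and take the resulting step."""
    g = grad_D(alpha)[block]
    H = hess_D(alpha)[np.ix_(block, block)]   # principal submatrix on the block
    alpha = alpha.copy()
    alpha[block] -= np.linalg.solve(H, g)     # Newton step (same form for ascent)
    return alpha
```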

Coordinate Descent with Arbitrary Sampling II: Expected Separable Overapproximation

no code implementations • 27 Dec 2014 • Zheng Qu, Peter Richtárik

The design and complexity analysis of randomized coordinate descent methods, and in particular of variants which update a random subset (sampling) of coordinates in each iteration, depends on the notion of expected separable overapproximation (ESO).
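
Under the notation commonly used in these papers (p_i = P(i ∈ Ŝ), and h_[Ŝ] the vector h with coordinates outside the sampled set Ŝ zeroed out), the ESO inequality reads roughly as follows; the parameters v_i are what enter the step sizes and complexity bounds.

```latex
% Expected separable overapproximation of f with respect to the sampling \hat{S},
% with parameters v = (v_1, \dots, v_n):
\mathbf{E}\!\left[ f\big(x + h_{[\hat{S}]}\big) \right]
  \;\le\; f(x) + \sum_{i=1}^{n} p_i \, \nabla_i f(x)\, h_i
  \;+\; \frac{1}{2} \sum_{i=1}^{n} p_i v_i h_i^2
  \qquad \text{for all } x, h \in \mathbb{R}^n
```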

Coordinate Descent with Arbitrary Sampling I: Algorithms and Complexity

no code implementations • 27 Dec 2014 • Zheng Qu, Peter Richtárik

ALPHA is a remarkably flexible algorithm: in special cases, it reduces to deterministic and randomized methods such as gradient descent, coordinate descent, parallel coordinate descent and distributed coordinate descent -- both in nonaccelerated and accelerated variants.

Randomized Dual Coordinate Ascent with Arbitrary Sampling

no code implementations • 21 Nov 2014 • Zheng Qu, Peter Richtárik, Tong Zhang

The distributed variant of Quartz is the first distributed SDCA-like method with an analysis for non-separable data.

Fast Distributed Coordinate Descent for Non-Strongly Convex Losses

no code implementations • 21 May 2014 • Olivier Fercoq, Zheng Qu, Peter Richtárik, Martin Takáč

We propose an efficient distributed randomized coordinate descent method for minimizing regularized non-strongly convex loss functions.

Squaring-Up Method In the Presence of Transmission Zeros

1 code implementation • 5 Oct 2013 • Zheng Qu, Daniel Wiese, Anuradha M. Annaswamy, Eugene Lavretsky

This paper presents a method to square up a generic MIMO system that already possesses transmission zeros.
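
For context, squaring up refers to augmenting a non-square system so that it has as many inputs as outputs while retaining desirable zero properties. A generic statement of the goal in assumed notation (the paper's actual construction, which handles systems that already possess transmission zeros, is not reproduced):

```latex
% A tall LTI system with m inputs and p > m outputs:
\dot{x} = A x + B_1 u, \qquad y = C x, \qquad B_1 \in \mathbb{R}^{n \times m}, \; C \in \mathbb{R}^{p \times n}
% Squaring up: find B_2 \in \mathbb{R}^{n \times (p-m)} such that the augmented system
\dot{x} = A x + \begin{bmatrix} B_1 & B_2 \end{bmatrix} u, \qquad y = C x
% is square (p inputs, p outputs) and all of its transmission zeros are stable.
```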

Optimization and Control
