Search Results for author: Zhouyuan Huo

Found 31 papers, 4 papers with code

Decoupled Parallel Backpropagation with Convergence Guarantee

3 code implementations ICML 2018 Zhouyuan Huo, Bin Gu, Qian Yang, Heng Huang

The backward locking in the backpropagation algorithm prevents us from updating network layers in parallel and fully leveraging the available computing resources.
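To make the backward-locking point concrete, here is a minimal sketch of the decoupled idea on a toy two-module linear network with squared loss, where the lower module updates with a one-step-stale gradient; all names, shapes, and learning rates below are illustrative choices, not the paper's implementation:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 10))            # toy input batch
Y = rng.normal(size=(32, 1))             # toy targets
W1 = 0.1 * rng.normal(size=(10, 8))      # module 1 (lower layers)
W2 = 0.1 * rng.normal(size=(8, 1))       # module 2 (upper layers)
lr = 0.05
delayed_grad_h = np.zeros((32, 8))       # stale dL/dh from the previous step

for step in range(100):
    h = X @ W1                           # forward through module 1
    y_hat = h @ W2                       # forward through module 2
    dL_dy = 2.0 * (y_hat - Y) / len(X)   # squared-loss gradient

    grad_W2 = h.T @ dL_dy                # module 2 gradient from the current step
    grad_W1 = X.T @ delayed_grad_h       # module 1 reuses last step's gradient
    new_delayed_grad_h = dL_dy @ W2.T    # dL/dh handed to module 1 for the next step

    W2 -= lr * grad_W2                   # the two updates no longer wait
    W1 -= lr * grad_W1                   # on each other's backward pass
    delayed_grad_h = new_delayed_grad_h

print(float(np.mean((X @ W1 @ W2 - Y) ** 2)))  # training loss after 100 steps

Because module 1 consumes a gradient computed one step earlier, its update can be issued in parallel with module 2's instead of waiting for the full backward pass, which is the essence of breaking backward locking.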

Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization

no code implementations 10 Nov 2017 Zhouyuan Huo, Bin Gu, Ji Liu, Heng Huang

To the best of our knowledge, our method admits the fastest convergence rate for stochastic composition optimization: for the strongly convex composition problem, our algorithm is proved to admit linear convergence; for the general composition problem, our algorithm significantly improves the state-of-the-art convergence rate from $O(T^{-1/2})$ to $O((n_1+n_2)^{2/3}T^{-1})$.

Management reinforcement-learning +1
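For context, a common form of the finite-sum stochastic composition problem with a nonsmooth regularizer is written as follows (this is the standard formulation with my notation, not necessarily the paper's exact setup; $n_1$, $n_2$, and $T$ match the counts and iteration number in the rate above):

\min_{x \in \mathbb{R}^d} \; \frac{1}{n_1} \sum_{i=1}^{n_1} F_i\!\Big( \frac{1}{n_2} \sum_{j=1}^{n_2} G_j(x) \Big) + h(x),

where $h$ is the (possibly nonsmooth) regularizer.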

Distributed Asynchronous Dual Free Stochastic Dual Coordinate Ascent

no code implementations 29 May 2016 Zhouyuan Huo, Heng Huang

Our method does not require the dual formulation of the target problem during optimization.

Distributed Optimization

Asynchronous Stochastic Gradient Descent with Variance Reduction for Non-Convex Optimization

no code implementations 12 Apr 2016 Zhouyuan Huo, Heng Huang

We provide the first theoretical analysis of the convergence rate of the asynchronous stochastic variance reduced gradient (SVRG) algorithm for non-convex optimization.
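As a reference point, here is a minimal sketch of the serial SVRG gradient estimator that such analyses build on; the asynchronous, shared-memory scheduling studied in the paper is not reproduced, and the function names and step sizes are illustrative:

import numpy as np

def svrg_epoch(grad_i, w, n, lr=0.1, inner_steps=50, rng=None):
    """One SVRG epoch; grad_i(w, i) returns the gradient of the i-th component at w."""
    rng = rng or np.random.default_rng(0)
    w_snapshot = w.copy()
    full_grad = sum(grad_i(w_snapshot, i) for i in range(n)) / n  # full gradient at the snapshot
    for _ in range(inner_steps):
        i = rng.integers(n)
        # variance-reduced estimator: unbiased, with variance shrinking near the snapshot
        v = grad_i(w, i) - grad_i(w_snapshot, i) + full_grad
        w = w - lr * v
    return w

# toy usage: least-squares components with grad_i(w, i) = a_i (a_i^T w - b_i)
A = np.random.default_rng(1).normal(size=(100, 5))
b = A @ np.ones(5)
w = svrg_epoch(lambda w, i: A[i] * (A[i] @ w - b[i]), np.zeros(5), n=100)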

Inexact Proximal Gradient Methods for Non-convex and Non-smooth Optimization

no code implementations 18 Dec 2016 Bin Gu, De Wang, Zhouyuan Huo, Heng Huang

The theoretical results show that our inexact proximal gradient algorithms can achieve the same convergence rates as their exact counterparts in the non-convex setting.

BIG-bench Machine Learning

Zeroth-order Asynchronous Doubly Stochastic Algorithm with Variance Reduction

no code implementations 5 Dec 2016 Bin Gu, Zhouyuan Huo, Heng Huang

The convergence rate of existing asynchronous doubly stochastic zeroth-order algorithms is $O(\frac{1}{\sqrt{T}})$ (the same as for sequential stochastic zeroth-order optimization algorithms).
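For readers unfamiliar with zeroth-order optimization, a minimal sketch of a standard two-point gradient estimator (Gaussian smoothing) follows; the asynchrony and variance reduction contributed by the paper are not shown, and the smoothing parameter mu is an illustrative choice:

import numpy as np

def zo_gradient(f, x, mu=1e-4, rng=None):
    """Estimate grad f(x) from two function evaluations along a random direction."""
    rng = rng or np.random.default_rng(0)
    u = rng.normal(size=x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

# toy check on f(x) = ||x||^2 / 2, whose true gradient at x is x itself
x = np.ones(5)
samples = [zo_gradient(lambda z: 0.5 * z @ z, x, rng=np.random.default_rng(s)) for s in range(200)]
print(np.mean(samples, axis=0))  # approximately [1, 1, 1, 1, 1], up to Monte-Carlo noise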

Asynchronous Stochastic Block Coordinate Descent with Variance Reduction

no code implementations 29 Oct 2016 Bin Gu, Zhouyuan Huo, Heng Huang

In this paper, we focus on a composite objective function consisting of a smooth convex function $f$ and a block-separable convex function, a structure that arises widely in machine learning and computer vision.

Stochastic Optimization
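In symbols, a composite objective of the kind described above is commonly written as follows (my notation, with $x$ partitioned into blocks $x_1, \dots, x_k$; the paper's exact assumptions may differ):

\min_{x \in \mathbb{R}^d} \; f(x) + \sum_{j=1}^{k} g_j(x_j), \qquad f \text{ smooth and convex}, \quad g_j \text{ convex, possibly nonsmooth}.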

Decoupled Asynchronous Proximal Stochastic Gradient Descent with Variance Reduction

no code implementations 22 Sep 2016 Zhouyuan Huo, Bin Gu, Heng Huang

In this paper, we propose a faster method, decoupled asynchronous proximal stochastic variance reduced gradient descent method (DAP-SVRG).

Training Neural Networks Using Features Replay

no code implementations NeurIPS 2018 Zhouyuan Huo, Bin Gu, Heng Huang

Training a neural network with the backpropagation algorithm requires passing error gradients sequentially through the network.

Ego-Downward and Ambient Video based Person Location Association

no code implementations 2 Dec 2018 Liang Yang, Hao Jiang, Jizhong Xiao, Zhouyuan Huo

To provide a possible solution to this problem, this paper proposes a camera system with both an ego-downward and a static third-person view to perform localization and tracking with a learning-based approach.

Faster Derivative-Free Stochastic Algorithm for Shared Memory Machines

no code implementations ICML 2018 Bin Gu, Zhouyuan Huo, Cheng Deng, Heng Huang

Asynchronous parallel stochastic gradient optimization plays a pivotal role in solving large-scale machine learning problems in big data applications.

Ensemble Learning

Faster Gradient-Free Proximal Stochastic Methods for Nonconvex Nonsmooth Optimization

no code implementations 16 Feb 2019 Feihu Huang, Bin Gu, Zhouyuan Huo, Songcan Chen, Heng Huang

The proximal gradient method plays an important role in solving many machine learning tasks, especially nonsmooth problems.

BIG-bench Machine Learning
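For reference, a minimal sketch of one plain proximal gradient step with an $\ell_1$ regularizer (soft-thresholding); in the gradient-free setting studied by the paper, grad(x) would be replaced by a zeroth-order estimate, and all names here are illustrative:

import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def proximal_gradient_step(x, grad, lr, lam):
    """One step of x <- prox_{lr * lam * ||.||_1}(x - lr * grad(x))."""
    return soft_threshold(x - lr * grad(x), lr * lam)

# toy usage on f(x) = 0.5 * ||x - c||^2 with lam = 0.1
c = np.array([1.0, -0.05, 0.3])
x = proximal_gradient_step(np.zeros(3), lambda x: x - c, lr=1.0, lam=0.1)
print(x)  # small entries of c are shrunk to exactly zero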

On the Acceleration of Deep Learning Model Parallelism with Staleness

no code implementations CVPR 2020 An Xu, Zhouyuan Huo, Heng Huang

Training deep convolutional neural networks for computer vision problems is slow and inefficient, especially when the model is large and distributed across multiple devices.

Straggler-Agnostic and Communication-Efficient Distributed Primal-Dual Algorithm for High-Dimensional Data Mining

no code implementations 9 Oct 2019 Zhouyuan Huo, Heng Huang

Recently, reducing communication time between machines has become the main focus of distributed data mining.

Large Batch Training Does Not Need Warmup

no code implementations 4 Feb 2020 Zhouyuan Huo, Bin Gu, Heng Huang

Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.

Faster On-Device Training Using New Federated Momentum Algorithm

no code implementations 6 Feb 2020 Zhouyuan Huo, Qian Yang, Bin Gu, Lawrence Carin, Heng Huang

Mobile crowdsensing has gained significant attention in recent years and has become a critical paradigm for emerging Internet of Things applications.

Federated Learning

Optimal Gradient Quantization Condition for Communication-Efficient Distributed Training

no code implementations 25 Feb 2020 An Xu, Zhouyuan Huo, Heng Huang

The communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications.

Quantization
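As background, a minimal sketch of unbiased stochastic gradient quantization (QSGD-style random rounding onto a uniform grid); the optimal quantization condition derived in the paper is not reproduced here, and the number of levels is an illustrative choice:

import numpy as np

def quantize(g, levels=4, rng=None):
    """Randomly round |g| / ||g|| onto a uniform grid so the estimate stays unbiased."""
    rng = rng or np.random.default_rng(0)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return g
    scaled = np.abs(g) / norm * levels
    lower = np.floor(scaled)
    q = lower + (rng.random(g.shape) < (scaled - lower))  # stochastic rounding
    return np.sign(g) * q * norm / levels

g = np.array([0.3, -1.2, 0.05, 0.8])
print(quantize(g))  # each entry lands on a grid point; the expectation equals g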

Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

no code implementations 13 Aug 2020 An Xu, Zhouyuan Huo, Heng Huang

Both our theoretical and empirical results show that our new methods can handle the "gradient mismatch" problem.
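For context, a minimal sketch of classic error feedback with a top-k gradient compressor, which is the baseline setting in which "gradient mismatch" arises; the step-ahead correction proposed in the paper is not reproduced, and all names and constants are illustrative:

import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def error_feedback_step(w, grad, error, lr=0.1, k=2):
    corrected = lr * grad + error           # add back what previous compressions dropped
    update = top_k(corrected, k)            # transmit only the compressed update
    return w - update, corrected - update   # new weights, new residual error

w, e = np.zeros(5), np.zeros(5)
for _ in range(3):
    g = np.array([0.5, -0.1, 0.9, 0.05, -0.4])
    w, e = error_feedback_step(w, g, e)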

Privacy-Preserving Asynchronous Federated Learning Algorithms for Multi-Party Vertically Collaborative Learning

no code implementations 14 Aug 2020 Bin Gu, An Xu, Zhouyuan Huo, Cheng Deng, Heng Huang

To the best of our knowledge, AFSGD-VP and its SVRG and SAGA variants are the first asynchronous federated learning algorithms for vertically partitioned data.

Federated Learning Privacy Preserving

Large-scale ASR Domain Adaptation using Self- and Semi-supervised Learning

no code implementations 1 Oct 2021 Dongseong Hwang, Ananya Misra, Zhouyuan Huo, Nikhil Siddhartha, Shefali Garg, David Qiu, Khe Chai Sim, Trevor Strohman, Françoise Beaufays, Yanzhang He

Self- and semi-supervised learning methods have been actively investigated to reduce the need for labeled training data or to enhance model performance.

Domain Adaptation

Pseudo Label Is Better Than Human Label

no code implementations 22 Mar 2022 Dongseong Hwang, Khe Chai Sim, Zhouyuan Huo, Trevor Strohman

State-of-the-art automatic speech recognition (ASR) systems are trained with tens of thousands of hours of labeled speech data.

Automatic Speech Recognition (ASR) +2

JOIST: A Joint Speech and Text Streaming Model For ASR

no code implementations 13 Oct 2022 Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman

In addition, we explore JOIST using a streaming E2E model with an order of magnitude more data, which is also novel compared to previous works.

Resource-Efficient Transfer Learning From Speech Foundation Model Using Hierarchical Feature Fusion

no code implementations 4 Nov 2022 Zhouyuan Huo, Khe Chai Sim, Bo Li, Dongseong Hwang, Tara N. Sainath, Trevor Strohman

Experimental results show that the proposed method achieves better performance on the speech recognition task than existing algorithms, with fewer trainable parameters, lower computational memory cost, and faster training speed.

Automatic Speech Recognition (ASR) +2
