Search Results for author: Xiangru Lian

Found 20 papers, 9 papers with code

E^2VTS: Energy-Efficient Video Text Spotting from Unmanned Aerial Vehicles

1 code implementation 5 Jun 2022 Zhenyu Hu, Zhenyu Wu, Pengcheng Pi, Yunhe Xue, Jiayi Shen, Jianchao Tan, Xiangru Lian, Zhangyang Wang, Ji Liu

Unmanned Aerial Vehicle (UAV)-based video text spotting has been extensively used in civil and military domains.

Text Spotting

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

1 code implementation 10 Nov 2021 Xiangru Lian, Binhang Yuan, XueFeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, Ji Liu

Specifically, to ensure both training efficiency and training accuracy, we design a novel hybrid training algorithm in which the embedding layer and the dense neural network are handled by different synchronization mechanisms; we then build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm.
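
The hybrid split described above can be illustrated with a toy, single-process sketch (a simple linear model with hypothetical names, not the Persia API): the large embedding table is updated asynchronously as soon as a worker's gradients are ready, while the small dense part is updated synchronously from a gradient averaged across workers.

```python
import numpy as np

# Single-process toy model of the hybrid scheme; all names are illustrative.
rng = np.random.default_rng(0)
emb_dim, lr = 4, 0.1
embedding_table = {i: rng.normal(size=emb_dim) for i in range(10)}  # the huge, sharded part in practice
dense_weights = rng.normal(size=emb_dim)                            # the small dense model

def worker_gradients(ids, labels):
    """Toy linear model: prediction = <dense_weights, embedding[id]>."""
    emb_grads, dense_grad = {}, np.zeros_like(dense_weights)
    for i, y in zip(ids, labels):
        e = embedding_table[i]
        err = dense_weights @ e - y
        emb_grads[i] = err * dense_weights            # gradient w.r.t. that embedding row
        dense_grad += err * e / len(ids)
    return emb_grads, dense_grad

batches = [([0, 1], [1.0, -1.0]), ([1, 2], [0.5, 0.0])]  # one toy mini-batch per worker
dense_grads = []
for ids, labels in batches:
    emb_grads, dense_grad = worker_gradients(ids, labels)
    # Asynchronous path: embedding updates are pushed immediately, without a barrier.
    for i, g in emb_grads.items():
        embedding_table[i] -= lr * g
    dense_grads.append(dense_grad)

# Synchronous path: dense gradients are averaged across workers (an all-reduce
# in a real system) and applied once per step.
dense_weights -= lr * np.mean(dense_grads, axis=0)
print(np.round(dense_weights, 3))
```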

Recommendation Systems

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

1 code implementation 11 Jun 2021 Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu

Games are abstractions of the real world, where artificial agents learn to compete and cooperate with other agents.

Game of Poker, Multi-agent Reinforcement Learning, +2

1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed

2 code implementations 4 Feb 2021 Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He

One of the most effective methods is error-compensated compression, which offers robust convergence speed even under 1-bit compression.
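
As a rough illustration of what error-compensated 1-bit compression refers to, here is a generic sketch of the sign-compression-with-error-feedback idea (not the 1-bit Adam or DeepSpeed implementation):

```python
import numpy as np

def compress_with_error_feedback(grad, error):
    """1-bit (sign) compression with an error-feedback buffer."""
    corrected = grad + error                 # add back what was lost last step
    scale = np.mean(np.abs(corrected))       # one scalar per tensor
    compressed = scale * np.sign(corrected)  # 1 bit per entry, plus the scale
    new_error = corrected - compressed       # residual carried to the next step
    return compressed, new_error

# Toy usage: the carried residual keeps the long-run sum of the compressed
# gradients close to the sum of the true gradients.
rng = np.random.default_rng(0)
error = np.zeros(4)
for step in range(3):
    grad = rng.normal(size=4)
    compressed, error = compress_with_error_feedback(grad, error)
    print(step, np.round(compressed, 3), np.round(error, 3))
```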

APMSqueeze: A Communication Efficient Adam-Preconditioned Momentum SGD Algorithm

no code implementations 26 Aug 2020 Hanlin Tang, Shaoduo Gan, Samyam Rajbhandari, Xiangru Lian, Ji Liu, Yuxiong He, Ce Zhang

Adam is an important optimization algorithm for ensuring both efficiency and accuracy when training on many important tasks such as BERT and ImageNet.
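
For reference, the standard Adam update that the paper builds on, with stochastic gradient $g_t$, step size $\alpha$, decay rates $\beta_1, \beta_2$, and a small constant $\epsilon$:

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t, &\qquad v_t &= \beta_2 v_{t-1} + (1-\beta_2)\, g_t^{2},\\
\hat m_t &= m_t / (1-\beta_1^{t}), &\qquad \hat v_t &= v_t / (1-\beta_2^{t}),\\
x_{t+1} &= x_t - \alpha\, \hat m_t / \bigl(\sqrt{\hat v_t} + \epsilon\bigr). &&
\end{aligned}
```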

Stochastic Recursive Momentum for Policy Gradient Methods

no code implementations 9 Mar 2020 Huizhuo Yuan, Xiangru Lian, Ji Liu, Yuren Zhou

In this paper, we propose a novel algorithm named STOchastic Recursive Momentum for Policy Gradient (STORM-PG), which maintains a SARAH-type stochastic recursive variance-reduced policy-gradient estimator in an exponential moving average fashion.
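
A minimal sketch of a STORM-style recursive momentum estimator on a toy stochastic quadratic (the paper applies this kind of estimator to policy gradients with importance weighting; the objective and names below are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_grad(x, noise):
    """Noisy gradient of 0.5 * ||x||^2, standing in for a policy-gradient estimate."""
    return x + noise

x = np.ones(5)
x_prev, d = None, None
lr, a = 0.1, 0.2                     # step size and momentum parameter

for t in range(200):
    noise = 0.1 * rng.normal(size=x.shape)
    g = stochastic_grad(x, noise)
    if d is None:
        d = g                        # first step: plain stochastic gradient
    else:
        # SARAH-type correction: reuse the same sample at the previous iterate,
        # blended in an exponential-moving-average fashion.
        d = g + (1 - a) * (d - stochastic_grad(x_prev, noise))
    x_prev, x = x, x - lr * d

print(np.round(x, 3))                # close to the minimizer at the origin
```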

Policy Gradient Methods

Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex Compositional Optimization

no code implementations 31 Dec 2019 Huizhuo Yuan, Xiangru Lian, Ji Liu

This complexity is the best known among IFO complexity results for non-convex stochastic compositional optimization and is believed to be optimal.
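
For context, stochastic compositional optimization usually refers to problems of the form below, and IFO (incremental first-order oracle) complexity counts the number of component function and gradient evaluations needed to reach an approximate stationary point:

```latex
\min_{x \in \mathbb{R}^d} \; f(x) \;=\; \mathbb{E}_{\xi}\Bigl[\, F_{\xi}\bigl(\, \mathbb{E}_{\eta}[\, G_{\eta}(x) \,] \,\bigr) \Bigr]
```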

Management, Stochastic Optimization

$\texttt{DeepSqueeze}$: Decentralization Meets Error-Compensated Compression

no code implementations 17 Jul 2019 Hanlin Tang, Xiangru Lian, Shuang Qiu, Lei Yuan, Ce Zhang, Tong Zhang, Ji Liu

Since \emph{decentralized} training has been shown to be superior to traditional \emph{centralized} training in communication-restricted scenarios, a natural question to ask is "how can error-compensated compression be applied to decentralized learning to further reduce the communication cost?"

DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression

no code implementations 15 May 2019 Hanlin Tang, Xiangru Lian, Chen Yu, Tong Zhang, Ji Liu

For example, under the popular parameter server model for distributed learning, the worker nodes need to send the compressed local gradients to the parameter server, which performs the aggregation.
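
A toy sketch of that pattern with compression and error feedback applied on both passes (worker-to-server and server-to-workers), in the spirit of a double-pass scheme; the top-k compressor and the names here are illustrative, not the paper's implementation:

```python
import numpy as np

def topk_compress(v, k, error):
    """Keep the k largest-magnitude entries of v + carried error; return the residual."""
    corrected = v + error
    sparse = np.zeros_like(corrected)
    idx = np.argsort(np.abs(corrected))[-k:]
    sparse[idx] = corrected[idx]
    return sparse, corrected - sparse

rng = np.random.default_rng(0)
n_workers, dim, k = 4, 10, 3
worker_err = [np.zeros(dim) for _ in range(n_workers)]
server_err = np.zeros(dim)

for step in range(5):
    grads = [rng.normal(size=dim) for _ in range(n_workers)]

    # Pass 1: each worker compresses its local gradient before sending it up.
    sent = []
    for i, g in enumerate(grads):
        c, worker_err[i] = topk_compress(g, k, worker_err[i])
        sent.append(c)

    # Server aggregates, then (pass 2) compresses the average before broadcasting it back.
    avg = np.mean(sent, axis=0)
    broadcast, server_err = topk_compress(avg, k, server_err)
    print(step, int(np.count_nonzero(broadcast)), "nonzeros broadcast")
```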

$D^2$: Decentralized Training over Decentralized Data

no code implementations ICML 2018 Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu

When a machine learning model is trained by multiple workers, each of which collects data from its own data source, it would be useful if the data collected from different workers could be unique and different.

Image Classification, Multi-view Subspace Clustering

D$^2$: Decentralized Training over Decentralized Data

no code implementations 19 Mar 2018 Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu

When a machine learning model is trained by multiple workers, each of which collects data from its own data sources, it would be most useful if the data collected from different workers could be {\em unique} and {\em different}.

Image Classification

Asynchronous Decentralized Parallel Stochastic Gradient Descent

3 code implementations ICML 2018 Xiangru Lian, Wei Zhang, Ce Zhang, Ji Liu

Can we design an algorithm that is robust in a heterogeneous environment, while being communication efficient and maintaining the best-possible convergence rate?

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent

3 code implementations NeurIPS 2017 Xiangru Lian, Ce Zhang, Huan Zhang, Cho-Jui Hsieh, Wei Zhang, Ji Liu

On network configurations with low bandwidth or high latency, D-PSGD can be up to one order of magnitude faster than its well-optimized centralized counterparts.
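
As an illustration of what "decentralized" means here: instead of synchronizing through a central server, each worker takes a local SGD step and then averages its model only with its neighbors on a communication graph. A minimal single-process sketch of such a gossip-averaging loop (illustrative only, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, dim, lr = 4, 5, 0.1

# Ring topology: each worker mixes with itself and its two neighbors.
W = np.zeros((n_workers, n_workers))
for i in range(n_workers):
    W[i, i] = W[i, (i - 1) % n_workers] = W[i, (i + 1) % n_workers] = 1.0 / 3.0

targets = rng.normal(size=(n_workers, dim))     # each worker's local (toy quadratic) objective
models = np.zeros((n_workers, dim))

for step in range(200):
    # Local SGD step on each worker's own objective ...
    grads = models - targets
    models = models - lr * grads
    # ... followed by neighborhood averaging instead of a central aggregation.
    models = W @ models

# The average of the workers' models approaches the average of the local targets.
print(np.round(models.mean(axis=0), 3))
```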

Asynchronous Parallel Greedy Coordinate Descent

no code implementations NeurIPS 2016 Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh

In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints.
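
A sketch of the serial greedy coordinate descent step that Asy-GCD parallelizes (the asynchronous version runs many such coordinate updates concurrently); the box-constrained toy quadratic below is illustrative only:

```python
import numpy as np

def greedy_cd(grad_fn, x, lower, upper, lr=0.1, steps=200):
    """Projected greedy coordinate descent on a box [lower, upper]."""
    for _ in range(steps):
        g = grad_fn(x)
        # Candidate projected step for every coordinate; greedily pick the
        # coordinate that can make the largest feasible move.
        step = np.clip(x - lr * g, lower, upper) - x
        j = int(np.argmax(np.abs(step)))
        x[j] += step[j]
    return x

# Toy problem: minimize 0.5 * ||x - b||^2 subject to 0 <= x <= 1.
b = np.array([0.2, 1.5, -0.3, 0.7])
x = greedy_cd(lambda x: x - b, np.zeros(4), np.zeros(4), np.ones(4))
print(np.round(x, 2))        # approaches [0.2, 1.0, 0.0, 0.7], the projection of b onto the box
```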

Staleness-aware Async-SGD for Distributed Deep Learning

1 code implementation 18 Nov 2015 Wei Zhang, Suyog Gupta, Xiangru Lian, Ji Liu

Deep neural networks have been shown to achieve state-of-the-art performance in several machine learning tasks.

Distributed Computing, Image Classification

Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization

no code implementations NeurIPS 2015 Xiangru Lian, Yijun Huang, Yuncheng Li, Ji Liu

Asynchronous parallel implementations of stochastic gradient (SG) have been broadly used in training deep neural networks and have recently achieved many successes in practice.
