Search Results for author: Ruoyu Sun

Found 25 papers, 3 papers with code

Achieving Small Test Error in Mildly Overparameterized Neural Networks

no code implementations • 24 Apr 2021 • Shiyu Liang, Ruoyu Sun, R. Srikant

Recent theoretical works on over-parameterized neural nets have focused on two aspects: optimization and generalization.

Precondition Layer and Its Use for GANs

no code implementations • 1 Jan 2021 • Tiantian Fang, Alex Schwing, Ruoyu Sun

We use this PC-layer in two ways: 1) fixed preconditioning (FPC) adds a fixed PC-layer to all layers, and 2) adaptive preconditioning (APC) adaptively controls the strength of preconditioning.
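
The paper's exact PC-layer construction is not reproduced in this listing; purely as an illustration of the idea, a layer that applies a fixed linear preconditioner with a tunable strength might look like the sketch below, where the matrix P and the knob alpha are placeholders.

```python
import torch

class PCLayer(torch.nn.Module):
    """Illustrative only, not the paper's construction: applies a fixed
    preconditioning matrix P, with a strength knob alpha interpolating
    between the identity (alpha=0) and full preconditioning (alpha=1).
    FPC would freeze alpha; APC would adapt it during training."""
    def __init__(self, P, alpha=1.0):
        super().__init__()
        self.register_buffer("P", P)   # fixed preconditioner (placeholder)
        self.alpha = alpha             # preconditioning strength (placeholder)

    def forward(self, x):
        return (1 - self.alpha) * x + self.alpha * x @ self.P.T
```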

RMSprop converges with proper hyper-parameter

no code implementations • ICLR 2021 • Naichen Shi, Dawei Li, Mingyi Hong, Ruoyu Sun

Removing this assumption allows us to establish a phase transition from divergence to non-divergence for RMSProp.
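
For reference, a minimal NumPy sketch of the standard RMSProp update; the hyper-parameter in question is the second-moment decay beta, though the paper's precise divergence/non-divergence threshold is not reproduced here.

```python
import numpy as np

def rmsprop(grad, x0, lr=1e-3, beta=0.999, eps=1e-8, steps=1000):
    """Plain RMSProp; `beta` is the moving-average decay whose value the
    paper shows separates divergent from non-divergent behavior."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        v = beta * v + (1 - beta) * g ** 2        # second-moment estimate
        x = x - lr * g / (np.sqrt(v) + eps)       # preconditioned step
    return x

# usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x
print(rmsprop(lambda x: x, np.ones(3)))
```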

On the Landscape of Sparse Linear Networks

no code implementations • 1 Jan 2021 • Dachao Lin, Ruoyu Sun, Zhihua Zhang

Network pruning, i.e., training sparse networks, has a long history and practical significance in modern applications.

Network Pruning

On a Faster $R$-Linear Convergence Rate of the Barzilai-Borwein Method

no code implementations • 1 Jan 2021 • Dawei Li, Ruoyu Sun

The Barzilai-Borwein (BB) method has demonstrated great empirical success in nonlinear optimization.
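
For reference, a small NumPy sketch of the BB iteration with the long step size alpha_k = (s's)/(s'y), the method whose R-linear rate the paper analyzes; hyper-parameters are illustrative.

```python
import numpy as np

def bb_descent(grad, x0, steps=100, alpha0=1e-3):
    """Gradient descent with the (long) Barzilai-Borwein step size
    alpha_k = (s's) / (s'y), where s = x_k - x_{k-1}, y = g_k - g_{k-1}."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - alpha0 * g_prev                  # one plain step to start
    for _ in range(steps):
        g = grad(x)
        s, y = x - x_prev, g - g_prev
        sy = s @ y
        if sy == 0.0:                             # converged to machine precision
            break
        x_prev, g_prev = x, g
        x = x - (s @ s) / sy * g                  # BB step
    return x

# usage: strongly convex quadratic f(x) = 0.5 x'Ax
A = np.diag([1.0, 10.0, 100.0])
print(bb_descent(lambda v: A @ v, np.ones(3)))    # approaches the origin
```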

Towards a Better Global Loss Landscape of GANs

1 code implementation • NeurIPS 2020 • Ruoyu Sun, Tiantian Fang, Alex Schwing

We also perform experiments to support our theory that RpGAN has a better landscape than separable-GAN.
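
A minimal sketch of the distinction, assuming the standard relativistic pairing form (the exact losses studied in the paper may differ): RpGAN couples a real and a fake score inside one nonlinearity, whereas the separable loss sums two independent terms.

```python
import torch.nn.functional as F

def rpgan_d_loss(d_real, d_fake):
    """Relativistic pairing (RpGAN) discriminator loss: the real and fake
    scores are coupled inside a single softplus term."""
    return F.softplus(-(d_real - d_fake)).mean()

def separable_d_loss(d_real, d_fake):
    """Standard separable GAN discriminator loss, for comparison: two
    independent terms, one per score."""
    return F.softplus(-d_real).mean() + F.softplus(d_fake).mean()
```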

Center-wise Local Image Mixture For Contrastive Representation Learning

no code implementations • 5 Nov 2020 • Hao Li, Xiaopeng Zhang, Ruoyu Sun, Hongkai Xiong, Qi Tian

This is achieved by searching for locally similar samples of an image and selecting only those that are closer to the corresponding cluster center, which we denote as center-wise local selection.
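
A rough NumPy sketch of that selection rule as described in the snippet; the function name, the distance metric, and k are assumptions, not the authors' code.

```python
import numpy as np

def center_wise_local_selection(i, emb, centers, assign, k=16):
    """Illustrative sketch: among the k nearest neighbours of sample i,
    keep only those at least as close to i's cluster center as i itself.
    The paper's exact criterion and hyper-parameters may differ."""
    d = np.linalg.norm(emb - emb[i], axis=1)      # distances to the anchor
    nn = np.argsort(d)[1:k + 1]                   # k nearest (skip self)
    center = centers[assign[i]]                   # anchor's cluster center
    d_center = np.linalg.norm(emb - center, axis=1)
    return nn[d_center[nn] <= d_center[i]]        # center-wise local selection

# usage with toy data
emb = np.random.randn(100, 8)
centers = np.random.randn(5, 8)
assign = np.random.randint(0, 5, size=100)
print(center_wise_local_selection(0, emb, centers, assign))
```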

Contrastive Learning · Data Augmentation · +3

A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems

no code implementations • NeurIPS 2020 • Jiawei Zhang, Peijun Xiao, Ruoyu Sun, Zhi-Quan Luo

We prove that the stabilized GDA algorithm can achieve an $O(1/\epsilon^2)$ iteration complexity for minimizing the pointwise maximum of a finite collection of nonconvex functions.
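
A minimal sketch of a single-loop smoothed GDA iteration of this flavor, assuming the smoothing is a proximal term (p/2)||x - z||^2 with the auxiliary point z updated by exponential averaging; the update order, step sizes, and parameter values are assumptions, not the paper's.

```python
import numpy as np

def smoothed_gda(grad_x, grad_y, x, y, p=1.0, beta=0.1, cx=0.01, cy=0.01, steps=2000):
    """Sketch of a single-loop smoothed GDA: descend in x on the smoothed
    objective f(x, y) + (p/2)||x - z||^2, ascend in y, then move the
    auxiliary point z toward x by exponential averaging."""
    z = x.copy()
    for _ in range(steps):
        x = x - cx * (grad_x(x, y) + p * (x - z))   # primal descent
        y = y + cy * grad_y(x, y)                   # dual ascent
        z = z + beta * (x - z)                      # exponential averaging
    return x, y

# usage: f(x, y) = x.y - ||y||^2 / 2, which is concave in y
x, y = smoothed_gda(lambda x, y: y, lambda x, y: x - y, np.ones(2), np.ones(2))
print(x, y)   # both should approach the saddle point at the origin
```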

On the Landscape of One-hidden-layer Sparse Networks and Beyond

no code implementations • 16 Sep 2020 • Dachao Lin, Ruoyu Sun, Zhihua Zhang

We show that sparse linear networks can have spurious strict minima, which is in sharp contrast to dense linear networks which do not even have spurious minima.

Network Pruning

The Global Landscape of Neural Networks: An Overview

no code implementations • 2 Jul 2020 • Ruoyu Sun, Dawei Li, Shiyu Liang, Tian Ding, R. Srikant

Second, we discuss a few rigorous results on the geometric properties of wide networks such as "no bad basin", and some modifications that eliminate sub-optimal local minima and/or decreasing paths to infinity.

Global Convergence and Generalization Bound of Gradient-Based Meta-Learning with Deep Neural Nets

2 code implementations • 25 Jun 2020 • Haoxiang Wang, Ruoyu Sun, Bo Li

Gradient-based meta-learning (GBML) with deep neural nets (DNNs) has become a popular approach for few-shot learning.
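
For reference, a minimal PyTorch sketch of the MAML-style inner/outer loop that defines GBML; this is a generic textbook instance on a toy linear model, not the paper's experimental code.

```python
import torch

def forward(w, x):
    return x @ w                                   # toy linear model

def maml_meta_loss(w, tasks, inner_lr=0.1):
    """GBML in its MAML form: one inner gradient step per task on the
    support set, then the query-set loss of the adapted weights is
    backpropagated to the shared initialization w."""
    loss = 0.0
    for (xs, ys), (xq, yq) in tasks:
        inner = ((forward(w, xs) - ys) ** 2).mean()            # support loss
        g, = torch.autograd.grad(inner, w, create_graph=True)  # keep graph
        w_task = w - inner_lr * g                              # adapted weights
        loss = loss + ((forward(w_task, xq) - yq) ** 2).mean() # query loss
    return loss / len(tasks)

# usage: one meta-update on toy regression tasks
w = torch.zeros(5, 1, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.01)
tasks = [((torch.randn(10, 5), torch.randn(10, 1)),
          (torch.randn(10, 5), torch.randn(10, 1))) for _ in range(4)]
opt.zero_grad(); maml_meta_loss(w, tasks).backward(); opt.step()
```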

Few-Shot Learning

Distilling Object Detectors with Task Adaptive Regularization

no code implementations • 23 Jun 2020 • Ruoyu Sun, Fuhui Tang, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.
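
For reference, a minimal PyTorch sketch of the generic distillation loss (Hinton-style temperature softening plus cross-entropy); the paper's task-adaptive regularization for detectors is not reproduced here.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Generic KD objective: soften both logit sets with temperature T,
    match them with KL (scaled by T^2), and mix with the usual CE term."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```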

Knowledge Distillation · Region Proposal

DEED: A General Quantization Scheme for Communication Efficiency in Bits

no code implementations • 19 Jun 2020 • Tian Ye, Peijun Xiao, Ruoyu Sun

In the infrequent communication setting, DEED combined with Federated Averaging requires a smaller total number of bits than Federated Averaging alone.
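
DEED's exact encoding is not reproduced here; as a generic illustration of quantizing an update to a small number of bits before communication, a QSGD-style unbiased stochastic quantizer looks like this (NOT the paper's scheme).

```python
import numpy as np

def quantize(v, bits=4):
    """Illustrative unbiased stochastic quantizer, not the DEED scheme:
    scale magnitudes to [0, levels], round stochastically, rescale."""
    levels = 2 ** bits - 1
    scale = np.abs(v).max() + 1e-12
    u = np.abs(v) / scale * levels
    q = np.floor(u) + (np.random.rand(*v.shape) < (u - np.floor(u)))
    return np.sign(v) * q * scale / levels        # dequantized update

# usage: quantize a model update before sending it to the server
update = np.random.randn(1000)
print(np.linalg.norm(update - quantize(update, bits=4)))
```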

Distributed Optimization · Federated Learning · +1

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity

no code implementations • 31 Dec 2019 • Shiyu Liang, Ruoyu Sun, R. Srikant

More specifically, for a large class of over-parameterized deep neural networks with appropriate regularizers, the loss function has no bad local minima and no decreasing paths to infinity.

Optimization for deep learning: theory and algorithms

no code implementations • 19 Dec 2019 • Ruoyu Sun

When and why can a neural network be successfully trained?

Learning Theory

Sub-Optimal Local Minima Exist for Neural Networks with Almost All Non-Linear Activations

no code implementations • 4 Nov 2019 • Tian Ding, Dawei Li, Ruoyu Sun

More specifically, we prove that for any multi-layer network with generic input data and non-linear activation functions, sub-optimal local minima can exist, no matter how wide the network is (as long as the last hidden layer has at least two neurons).

Understanding Limitation of Two Symmetrized Orders by Worst-case Complexity

no code implementations • 10 Oct 2019 • Peijun Xiao, Zhisheng Xiao, Ruoyu Sun

Recently, Coordinate Descent (CD) with a cyclic order was shown to be $O(n^2)$ times slower than randomized versions in the worst case.
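
For reference, a small NumPy sketch of the two orders being compared: cyclic versus randomized coordinate descent with exact coordinate minimization on a quadratic.

```python
import numpy as np

def coordinate_descent(A, b, order="cyclic", epochs=50, seed=0):
    """CD for f(x) = 0.5 x'Ax - b'x with exact coordinate minimization.
    'cyclic' sweeps coordinates in order; 'random' samples them uniformly."""
    rng = np.random.default_rng(seed)
    n = len(b)
    x = np.zeros(n)
    for _ in range(epochs):
        idx = range(n) if order == "cyclic" else rng.integers(0, n, n)
        for i in idx:
            x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]  # exact 1-D min
    return x

# usage: both orders should approach the solution of Ax = b
A = np.array([[2.0, 1.0], [1.0, 3.0]]); b = np.array([1.0, 2.0])
print(coordinate_descent(A, b), np.linalg.solve(A, b))
```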

Off-road Autonomous Vehicles Traversability Analysis and Trajectory Planning Based on Deep Inverse Reinforcement Learning

no code implementations • 16 Sep 2019 • Zeyu Zhu, Nan Li, Ruoyu Sun, Huijing Zhao, Donghao Xu

Different cost functions for traversability analysis are learned and tested in various scenes to assess their capability in guiding the trajectory planning of different behaviors.

Autonomous Vehicles · Trajectory Planning

Max-Sliced Wasserstein Distance and its use for GANs

no code implementations • CVPR 2019 • Ishan Deshpande, Yuan-Ting Hu, Ruoyu Sun, Ayis Pyrros, Nasir Siddiqui, Sanmi Koyejo, Zhizhen Zhao, David Forsyth, Alexander Schwing

Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation and feature learning.
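
As an illustration of the max-sliced Wasserstein distance in the title, a crude NumPy estimator that searches random unit directions and keeps the worst one-dimensional Wasserstein-1 slice; the paper optimizes the slicing direction rather than sampling it.

```python
import numpy as np

def max_sliced_w1(X, Y, n_dirs=512, seed=0):
    """Crude estimate of the max-sliced Wasserstein-1 distance between two
    equal-size samples: project onto random unit directions, compute the
    closed-form 1-D W1 by sorting, and keep the worst slice."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_dirs):
        w = rng.standard_normal(X.shape[1])
        w /= np.linalg.norm(w)
        px, py = np.sort(X @ w), np.sort(Y @ w)   # 1-D projections
        best = max(best, np.abs(px - py).mean())  # 1-D W1 of equal-size samples
    return best

# usage: two Gaussians with shifted means
X, Y = np.random.randn(256, 8), np.random.randn(256, 8) + 1.0
print(max_sliced_w1(X, Y))
```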

Image-to-Image Translation

On the Benefit of Width for Neural Networks: Disappearance of Bad Basins

no code implementations • 28 Dec 2018 • Dawei Li, Tian Ding, Ruoyu Sun

Wide networks are often believed to have a nice optimization landscape, but what rigorous results can we prove?

On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization

no code implementations • ICLR 2019 • Xiangyi Chen, Sijia Liu, Ruoyu Sun, Mingyi Hong

We prove that under our derived conditions, these methods can achieve the convergence rate of order $O(\log{T}/\sqrt{T})$ for nonconvex stochastic optimization.
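
For reference, a minimal NumPy sketch of the generic Adam update that these Adam-type methods instantiate; the paper's derived conditions on beta1, beta2, and the step sizes are not reproduced here.

```python
import numpy as np

def adam(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Generic Adam update: first- and second-moment estimates with bias
    correction, then a coordinate-wise scaled step."""
    x = np.asarray(x0, dtype=float)
    m, v = np.zeros_like(x), np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        mhat, vhat = m / (1 - beta1 ** t), v / (1 - beta2 ** t)  # bias correction
        x = x - lr * mhat / (np.sqrt(vhat) + eps)
    return x

# usage: minimize f(x) = 0.5 * ||x||^2
print(adam(lambda x: x, np.ones(3)))
```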

Stochastic Optimization

Adding One Neuron Can Eliminate All Bad Local Minima

no code implementations • NeurIPS 2018 • Shiyu Liang, Ruoyu Sun, Jason D. Lee, R. Srikant

One of the main difficulties in analyzing neural networks is the non-convexity of the loss function which may have many bad local minima.

General Classification

Understanding the Loss Surface of Neural Networks for Binary Classification

no code implementations • ICML 2018 • Shiyu Liang, Ruoyu Sun, Yixuan Li, R. Srikant

Here we focus on the training performance of single-layered neural networks for binary classification, and provide conditions under which the training error is zero at all local minima of a smooth hinge loss function.
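
One common quadratically smoothed hinge, shown as a NumPy sketch; the exact smooth hinge loss analyzed in the paper may differ in its smoothing.

```python
import numpy as np

def smooth_hinge(margin):
    """A common quadratically smoothed hinge on the margin m = y * f(x):
    0 for m >= 1, quadratic on [0, 1], linear for m <= 0."""
    m = np.asarray(margin, dtype=float)
    return np.where(m >= 1, 0.0,
                    np.where(m <= 0, 0.5 - m, 0.5 * (1.0 - m) ** 2))

# usage: loss decreases smoothly as the margin grows
print(smooth_hinge([-1.0, 0.0, 0.5, 1.0, 2.0]))   # [1.5, 0.5, 0.125, 0.0, 0.0]
```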

Classification · General Classification

Training Language Models Using Target-Propagation

1 code implementation • 15 Feb 2017 • Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, Nicolas Vasilache

While Truncated Back-Propagation through Time (BPTT) is the most popular approach to training Recurrent Neural Networks (RNNs), it suffers from being inherently sequential (making parallelization difficult) and from truncating gradient flow between distant time-steps.
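
For reference, a minimal PyTorch sketch of the truncated BPTT loop being contrasted: detaching the hidden state every k steps is precisely what cuts gradient flow between distant time steps.

```python
import torch

def tbptt_train(rnn, readout, seq, targets, k=20, lr=0.1):
    """Truncated BPTT: process the sequence in chunks of k steps, backprop
    within each chunk, and detach the hidden state at chunk boundaries so
    no gradient flows between distant time steps."""
    opt = torch.optim.SGD(list(rnn.parameters()) + list(readout.parameters()), lr=lr)
    h = None
    for t0 in range(0, seq.size(0), k):
        x, y = seq[t0:t0 + k], targets[t0:t0 + k]
        out, h = rnn(x, h)
        loss = torch.nn.functional.mse_loss(readout(out), y)
        opt.zero_grad(); loss.backward(); opt.step()
        h = h.detach()                         # truncate gradient flow here

# usage with a toy RNN (a real language model would use embeddings + CE loss)
rnn, readout = torch.nn.RNN(8, 16), torch.nn.Linear(16, 8)
seq, targets = torch.randn(100, 4, 8), torch.randn(100, 4, 8)
tbptt_train(rnn, readout, seq, targets)
```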

Guaranteed Matrix Completion via Non-convex Factorization

no code implementations • 28 Nov 2014 • Ruoyu Sun, Zhi-Quan Luo

In this paper, we establish a theoretical guarantee for the factorization formulation to correctly recover the underlying low-rank matrix.
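
A minimal NumPy sketch of the factorization formulation in question: gradient descent on f(U, V) = 0.5 * ||P_Omega(U V' - M)||_F^2 with rank-r factors; hyper-parameters are chosen only for illustration.

```python
import numpy as np

def factorized_completion(M, mask, r, lr=0.01, steps=2000, seed=0):
    """Non-convex factorization for matrix completion: gradient descent on
    the squared error of U V' over the observed entries (mask)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, r)) * 0.1
    V = rng.standard_normal((n, r)) * 0.1
    for _ in range(steps):
        R = mask * (U @ V.T - M)                  # residual on observed entries
        U, V = U - lr * R @ V, V - lr * R.T @ U   # simultaneous gradient step
    return U, V

# usage: recover a random rank-2 matrix from roughly half of its entries
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = rng.random(M.shape) < 0.5
U, V = factorized_completion(M, mask, r=2)
print(np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))  # relative error
```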

Matrix Completion
