Search Results for author: Mingyang Yi

Found 17 papers, 5 papers with code

Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space

no code implementations • 24 Jan 2024 • Mingyang Yi, Bohan Wang

In this paper, we aim to enrich the continuous optimization methods in the Wasserstein space by extending the gradient flow to the stochastic gradient descent (SGD) flow and the stochastic variance reduced gradient (SVRG) flow.

Stochastic Optimization
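
The Wasserstein-space flows in this paper generalize the classical finite-sum SVRG scheme. As a flat-space point of reference only (not the paper's Riemannian construction), a minimal Euclidean SVRG loop can be sketched as follows; the helper names and hyperparameters are illustrative:

    import numpy as np

    def svrg(grad_i, w0, n, lr=0.1, epochs=5, inner_steps=100, seed=0):
        # Classical Euclidean SVRG for F(w) = (1/n) * sum_i f_i(w);
        # grad_i(w, i) returns the gradient of the i-th component f_i at w.
        rng = np.random.default_rng(seed)
        w = w0.copy()
        for _ in range(epochs):
            w_snap = w.copy()                                   # snapshot point
            full_grad = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
            for _ in range(inner_steps):
                i = rng.integers(n)                             # sample one component
                v = grad_i(w, i) - grad_i(w_snap, i) + full_grad  # variance-reduced gradient
                w -= lr * v
        return w

The SGD flow corresponds to dropping the snapshot correction, i.e. stepping along grad_i(w, i) alone.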

SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models

1 code implementation • NeurIPS 2023 • Shuchen Xue, Mingyang Yi, Weijian Luo, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhi-Ming Ma

Based on our analysis, we propose SA-Solver, an improved and efficient stochastic Adams method for solving diffusion SDEs to generate high-quality data.

Image Generation
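
SA-Solver's stochastic Adams coefficients are derived in the paper specifically for the diffusion SDE; as background only, the deterministic two-step Adams-Bashforth update that this family of multistep solvers builds on looks like this (illustrative code, not the released implementation):

    def adams_bashforth2(f, x0, t0, t1, n_steps):
        # Two-step Adams-Bashforth integrator for dx/dt = f(x, t): each step reuses
        # the previous derivative evaluation, which is what makes multistep
        # (Adams-type) solvers cheap per step.
        h = (t1 - t0) / n_steps
        x, t = x0, t0
        f_prev = f(x, t)
        x, t = x + h * f_prev, t + h                     # bootstrap with one Euler step
        for _ in range(n_steps - 1):
            f_curr = f(x, t)
            x = x + h * (1.5 * f_curr - 0.5 * f_prev)    # AB2 update
            f_prev, t = f_curr, t + h
        return x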

On the Generalization of Diffusion Model

no code implementations • 24 May 2023 • Mingyang Yi, Jiacheng Sun, Zhenguo Li

To understand this contradiction, we empirically verify the difference between a sufficiently trained diffusion model and the empirical optimum.

Breaking Correlation Shift via Conditional Invariant Regularizer

no code implementations • 14 Jul 2022 • Mingyang Yi, Ruoyu Wang, Jiacheng Sun, Zhenguo Li, Zhi-Ming Ma

The correlation shift is caused by spurious attributes that correlate with the class label, as the correlation between them may vary between the training and test data.

Out-of-distribution Generalization with Causal Invariant Transformations

no code implementations • CVPR 2022 • Ruoyu Wang, Mingyang Yi, Zhitang Chen, Shengyu Zhu

In this work, we obviate these assumptions and tackle the OOD problem without explicitly recovering the causal feature.

Out-of-Distribution Generalization

Towards the Generalization of Contrastive Self-Supervised Learning

1 code implementation • 1 Nov 2021 • Weiran Huang, Mingyang Yi, Xuyang Zhao, Zihao Jiang

It reveals that the generalization ability of contrastive self-supervised learning is related to three key factors: alignment of positive samples, divergence of class centers, and concentration of augmented data.

Contrastive Learning Data Augmentation +1
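
Each of the three factors named above can be measured directly on embeddings. A rough sketch of the first two, with helper names of my own choosing rather than the paper's code:

    import torch
    import torch.nn.functional as F

    def alignment(z1, z2):
        # Mean squared distance between L2-normalized embeddings of two augmented
        # views of the same samples; smaller means better-aligned positive pairs.
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        return (z1 - z2).pow(2).sum(dim=1).mean()

    def class_center_divergence(z, labels):
        # Smallest pairwise distance between class-mean embeddings;
        # larger means better-separated class centers.
        z = F.normalize(z, dim=1)
        centers = torch.stack([z[labels == c].mean(dim=0) for c in labels.unique()])
        dists = torch.cdist(centers, centers)
        dists.fill_diagonal_(float("inf"))
        return dists.min()

Concentration of augmented data refers to how tightly the augmented views of a single sample cluster, and can be estimated analogously from the spread of each sample's augmentations.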

Improving OOD Generalization with Causal Invariant Transformations

no code implementations • 29 Sep 2021 • Ruoyu Wang, Mingyang Yi, Shengyu Zhu, Zhitang Chen

In this work, we obviate these assumptions and tackle the OOD problem without explicitly recovering the causal feature.

Improved OOD Generalization via Adversarial Training and Pre-training

no code implementations • 24 May 2021 • Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma

In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation generalizes well on OOD data.

Image Classification Natural Language Understanding
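
In practice the "robust to input perturbation" requirement is instantiated here with adversarial training. A generic PGD-style training step in the spirit of Madry et al. (2018), shown only as the standard recipe rather than the exact procedure of this paper:

    import torch
    import torch.nn.functional as F

    def pgd_adv_training_step(model, x, y, optimizer, eps=8/255, alpha=2/255, steps=10):
        # Craft an L-infinity PGD perturbation of the inputs, then update the
        # model on the perturbed batch.
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
        optimizer.zero_grad()
        F.cross_entropy(model(x + delta.detach()), y).backward()
        optimizer.step()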

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss

1 code implementation • ICLR 2021 • Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma

Inspired by adversarial training, we minimize this maximal expected loss (MMEL) and obtain a simple and interpretable closed-form solution: more attention should be paid to augmented samples with large loss values (i.e., harder examples).

Image Augmentation Image Classification +1
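
Read literally, the closed-form solution amounts to a per-example softmax over the losses of the augmented copies, so that high-loss augmentations receive larger weights. A minimal sketch under that reading (hypothetical helper, not a verbatim reproduction of the paper's update):

    import torch
    import torch.nn.functional as F

    def reweighted_augmentation_loss(model, x_augs, y, temperature=1.0):
        # x_augs: tensor of shape (n_aug, batch, ...) holding several augmented
        # copies of each example.  Harder (higher-loss) copies get larger weights
        # via a per-example softmax over the augmentation losses.
        losses = torch.stack([
            F.cross_entropy(model(x), y, reduction="none") for x in x_augs
        ])                                                   # shape (n_aug, batch)
        weights = F.softmax(losses.detach() / temperature, dim=0)
        return (weights * losses).sum(dim=0).mean()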

Accelerating Training of Batch Normalization: A Manifold Perspective

no code implementations • 8 Jan 2021 • Mingyang Yi

A network with BN is invariant to positive linear re-scaling of its weights, so there exist infinitely many functionally equivalent networks with different scales of weights.
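
This invariance is easy to check numerically: scaling the weights feeding a BN layer by any positive constant leaves the batch-normalized outputs (essentially) unchanged. A small self-contained check in plain NumPy, for illustration only:

    import numpy as np

    def batch_norm(z, eps=1e-5):
        # Per-feature batch normalization without affine parameters.
        return (z - z.mean(0)) / np.sqrt(z.var(0) + eps)

    rng = np.random.default_rng(0)
    x = rng.normal(size=(64, 10))     # a batch of inputs
    w = rng.normal(size=(10, 5))      # weights feeding the BN layer
    c = 3.7                           # any positive rescaling factor

    out_original = batch_norm(x @ w)
    out_rescaled = batch_norm(x @ (c * w))
    print(np.allclose(out_original, out_rescaled))   # True (up to the small eps)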

BN-invariant sharpness regularizes the training model to better generalization

no code implementations • 8 Jan 2021 • Mingyang Yi, Huishuai Zhang, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

However, it has been pointed out that the usual definitions of sharpness, which consider either the maximum or the integral of the loss over a $\delta$-ball of parameters around a minimum, cannot give a consistent measurement for scale-invariant neural networks, e.g., networks with batch normalization layers.
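
Schematically, the two notions of sharpness referred to above, stated for a loss $L$ around a minimum $w$ with radius $\delta$, are

    $\mathrm{Sharp}_{\max}(w) = \max_{\|\epsilon\| \le \delta} L(w+\epsilon) - L(w)$
    $\mathrm{Sharp}_{\mathrm{avg}}(w) = \frac{1}{\mathrm{Vol}(B_\delta)} \int_{B_\delta} \big( L(w+\epsilon) - L(w) \big)\, d\epsilon$

Because a BN network computes the same function after a positive rescaling of its weights while both quantities change with that scale, neither definition is consistent across functionally equivalent networks, which is the inconsistency the paper addresses.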

Characterization of Excess Risk for Locally Strongly Convex Population Risk

1 code implementation • 4 Dec 2020 • Mingyang Yi, Ruoyu Wang, Zhi-Ming Ma

Our bounds underscore that, with a locally strongly convex population risk, the models trained by any proper iterative algorithm can generalize well, even when the problem is non-convex and the dimension $d$ is large.

The Effect of Adversarial Training: A Theoretical Characterization

no code implementations • 25 Sep 2019 • Mingyang Yi, Huishuai Zhang, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

It has been widely shown that adversarial training (Madry et al., 2018) is empirically effective in defending against adversarial attacks.

Adversarial Attack

Stability and Convergence Theory for Learning ResNet: A Full Characterization

no code implementations • 25 Sep 2019 • Huishuai Zhang, Da Yu, Mingyang Yi, Wei Chen, Tie-Yan Liu

We show that for the standard initialization used in practice, $\tau = 1/\Omega(\sqrt{L})$ is a sharp value in characterizing the stability of the forward/backward process of ResNet, where $L$ is the number of residual blocks.

Optimization on Multiple Manifolds

no code implementations • ICLR 2019 • Mingyang Yi, Huishuai Zhang, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

Optimization on manifolds has been widely used in machine learning to handle optimization problems with constraints.
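
As a concrete instance of this constrained-optimization view, Riemannian gradient descent on the unit sphere projects the Euclidean gradient onto the tangent space and then retracts back to the manifold; a minimal sketch (illustrative only, not the algorithm proposed in the paper):

    import numpy as np

    def sphere_gradient_step(x, grad, lr=0.1):
        # One Riemannian gradient step on the unit sphere {x : ||x|| = 1}.
        riem_grad = grad - np.dot(grad, x) * x     # project onto the tangent space at x
        x_new = x - lr * riem_grad                 # move in the tangent direction
        return x_new / np.linalg.norm(x_new)       # retract back onto the sphere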

Stabilize Deep ResNet with A Sharp Scaling Factor $\tau$

1 code implementation • 17 Mar 2019 • Huishuai Zhang, Da Yu, Mingyang Yi, Wei Chen, Tie-Yan Liu

Moreover, for ResNets with normalization layers, adding such a factor $\tau$ also stabilizes training and yields significant performance gains for deep ResNets.
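
Concretely, the stabilized block is just $x_{l+1} = x_l + \tau \cdot F(x_l)$ with $\tau$ on the order of $1/\sqrt{L}$. A minimal PyTorch-style sketch with illustrative module names (not the authors' released code):

    import math
    import torch.nn as nn

    class ScaledResidualBlock(nn.Module):
        # Residual block whose branch output is scaled by tau ~ 1/sqrt(L),
        # where L is the total number of residual blocks in the network.
        def __init__(self, branch: nn.Module, num_blocks: int):
            super().__init__()
            self.branch = branch
            self.tau = 1.0 / math.sqrt(num_blocks)

        def forward(self, x):
            return x + self.tau * self.branch(x)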

Positively Scale-Invariant Flatness of ReLU Neural Networks

no code implementations • 6 Mar 2019 • Mingyang Yi, Qi Meng, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

That is to say, a minimum with balanced values of basis paths is more likely to be flat and to generalize better.
