Search Results for author: Wenqing Hu

Found 8 papers, 1 paper with code

Effective Subspace Indexing via Interpolation on Stiefel and Grassmann manifolds

no code implementations 1 Jan 2021 Wenqing Hu, Tiefeng Jiang, Zhu Li

We propose a novel local Subspace Indexing Model with Interpolation (SIM-I) for low-dimensional embedding of image datasets.
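Interpolating between local subspace models reduces to moving along geodesics on the Grassmann manifold. Below is a minimal numpy sketch of that primitive, using the standard log/exp-map construction of Edelman, Arias, and Smith; it is illustrative only, not the paper's SIM-I pipeline.

```python
import numpy as np

def grassmann_geodesic(U0, U1, t):
    """Point at time t on the geodesic from span(U0) to span(U1).

    U0, U1 are n x p with orthonormal columns, assumed in general
    position (U0^T U1 invertible). Log map:
    H = (U1 - U0 U0^T U1)(U0^T U1)^{-1} = Q tan(Theta) R^T, then
    gamma(t) = (U0 R cos(t Theta) + Q sin(t Theta)) R^T.
    """
    A = U0.T @ U1
    H = (U1 - U0 @ A) @ np.linalg.inv(A)
    Q, S, Rt = np.linalg.svd(H, full_matrices=False)
    Theta = np.arctan(S)        # principal angles between the subspaces
    R = Rt.T
    return (U0 @ R @ np.diag(np.cos(t * Theta))
            + Q @ np.diag(np.sin(t * Theta))) @ R.T

# Halfway point between two random 2-dimensional subspaces of R^5;
# gamma(0) = U0, gamma(1) spans span(U1), and every intermediate
# point again has orthonormal columns.
rng = np.random.default_rng(0)
U0, _ = np.linalg.qr(rng.normal(size=(5, 2)))
U1, _ = np.linalg.qr(rng.normal(size=(5, 2)))
U_half = grassmann_geodesic(U0, U1, 0.5)
```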

On the Noisy Gradient Descent that Generalizes as SGD

1 code implementation ICML 2020 Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu

The gradient noise of SGD is considered to play a central role in the observed strong generalization abilities of deep learning.
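A toy version of the comparison this sets up, under assumptions introduced here (a least-squares objective, an empirically estimated noise covariance; not the authors' construction): run full-batch gradient descent but inject Gaussian noise whose covariance matches the minibatch gradient noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: f(w) = ||Xw - y||^2 / (2n)
n, d, batch = 512, 10, 32
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def full_grad(w):
    return X.T @ (X @ w - y) / n

def minibatch_grad(w):
    idx = rng.choice(n, size=batch, replace=False)
    return X[idx].T @ (X[idx] @ w - y[idx]) / batch

w, lr = np.zeros(d), 0.05
for _ in range(200):
    # Estimate the SGD gradient-noise covariance at the current iterate
    samples = np.stack([minibatch_grad(w) for _ in range(16)])
    C = np.cov(samples, rowvar=False) + 1e-12 * np.eye(d)  # jitter for PSD
    # Full-batch step plus Gaussian noise with (approximately) SGD's covariance
    w -= lr * (full_grad(w) + rng.multivariate_normal(np.zeros(d), C))
```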

SHE2: Stochastic Hamiltonian Exploration and Exploitation for Derivative-Free Optimization

no code implementations ICLR 2019 Haoyi Xiong, Wenqing Hu, Zhanxing Zhu, Xinjian Li, Yunchao Zhang, Jun Huan

Derivative-free optimization (DFO) using trust-region methods is frequently used in machine learning applications, such as (hyper-)parameter optimization, where the derivatives of the objective function are not available.

BIG-bench Machine Learning, Text-to-Image Generation
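For orientation, here is a compact derivative-free baseline for the setting SHE2 targets: simultaneous-perturbation stochastic approximation (SPSA), which estimates a descent direction from just two function evaluations per step. This is a textbook scheme, not SHE2's Hamiltonian dynamics.

```python
import numpy as np

def spsa_minimize(f, x0, iters=500, a=0.1, c=0.1, seed=0):
    """Derivative-free minimization via SPSA (Spall): each step probes
    f at two randomly perturbed points and forms a stochastic gradient
    estimate, so no derivatives of f are ever needed."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for k in range(1, iters + 1):
        ak = a / k ** 0.602          # decaying step size
        ck = c / k ** 0.101          # decaying perturbation size
        delta = rng.choice([-1.0, 1.0], size=x.shape)  # Rademacher direction
        # With +/-1 entries, 1/delta equals delta elementwise
        ghat = (f(x + ck * delta) - f(x - ck * delta)) / (2.0 * ck) * delta
        x -= ak * ghat
    return x

# Example: minimize a quadratic without ever computing its gradient
x_star = spsa_minimize(lambda x: np.sum((x - 3.0) ** 2), np.zeros(4))
```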

Quasi-potential as an implicit regularizer for the loss function in the stochastic gradient descent

no code implementations 18 Jan 2019 Wenqing Hu, Zhanxing Zhu, Haoyi Xiong, Jun Huan

We show in this case that the quasi-potential function is related to the noise covariance structure of SGD via a partial differential equation of Hamilton-Jacobi type.

Relation, Variational Inference
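In Freidlin-Wentzell notation (introduced here for illustration, not quoted from the paper): for a small-noise diffusion approximation of SGD of the form dX_t = -∇f(X_t) dt + √ε σ(X_t) dW_t with noise covariance Σ = σσᵀ, the quasi-potential V solves a stationary Hamilton-Jacobi equation,

```latex
-\,\big\langle \nabla f(x),\, \nabla V(x) \big\rangle
  \;+\; \tfrac{1}{2}\, \nabla V(x)^{\top}\, \Sigma(x)\, \nabla V(x) \;=\; 0 ,
```

so V couples the loss f to the noise structure Σ, which is one way to read the SGD noise covariance as an implicit regularizer.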

A convergence analysis of the perturbed compositional gradient flow: averaging principle and normal deviations

no code implementations 2 Sep 2017 Wenqing Hu, Chris Junchi Li

By introducing a separation of fast and slow scales of the two equations, we show that the limit of the slow motion is given by an averaged ordinary differential equation.
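A generic slow-fast template for this statement, with notation assumed here: X^ε is the slow motion, Y^ε the fast one,

```latex
\begin{aligned}
dX^{\varepsilon}_t &= f\!\big(X^{\varepsilon}_t,\, Y^{\varepsilon}_t\big)\, dt, \\
dY^{\varepsilon}_t &= \frac{1}{\varepsilon}\, g\!\big(X^{\varepsilon}_t,\, Y^{\varepsilon}_t\big)\, dt
  \;+\; \frac{1}{\sqrt{\varepsilon}}\, \sigma\!\big(X^{\varepsilon}_t,\, Y^{\varepsilon}_t\big)\, dW_t ,
\end{aligned}
\qquad
\dot{\bar X}_t = \bar f\big(\bar X_t\big), \quad
\bar f(x) = \int f(x, y)\, \mu^{x}(dy),
```

where μ^x is the invariant measure of the fast process with the slow variable frozen at x; as ε → 0, the slow motion X^ε converges to the averaged ODE for X̄.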

On the diffusion approximation of nonconvex stochastic gradient descent

no code implementations 22 May 2017 Wenqing Hu, Chris Junchi Li, Lei Li, Jian-Guo Liu

In addition, we discuss the effect of batch size in deep neural networks and find that a small batch size helps SGD algorithms escape unstable stationary points and sharp minimizers.
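The batch-size effect can be read off the usual diffusion approximation (a common modeling convention; notation introduced here, not quoted from the paper). Writing the SGD update as w_{k+1} = w_k − η ĝ_B(w_k) with learning rate η and minibatch size B, the corresponding SDE is

```latex
dX_t \;=\; -\,\nabla f(X_t)\, dt \;+\; \sqrt{\frac{\eta}{B}}\; \sigma(X_t)\, dW_t ,
\qquad
\sigma(x)\,\sigma(x)^{\top} \;=\; \operatorname{Cov}\big[\nabla f_{\gamma}(x)\big],
```

where ∇f_γ is a single-sample gradient. Shrinking B (or growing η) enlarges the diffusion term, which is what makes escape from unstable stationary points and sharp minimizers easier.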

FWDA: a Fast Wishart Discriminant Analysis with its Application to Electronic Health Records Data Classification

no code implementations 25 Apr 2017 Haoyi Xiong, Wei Cheng, Wenqing Hu, Jiang Bian, Zhishan Guo

Classical LDA for EHR data classification, however, suffers from two handicaps: the ill-posed estimation of LDA parameters (e.g., the covariance matrix) and the "linear inseparability" of EHR data.

Classification, General Classification
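A standard remedy for the first handicap, shrinking the covariance estimate toward a scaled identity (Ledoit-Wolf), is available off the shelf. A minimal scikit-learn sketch on synthetic data (a common baseline for comparison, not FWDA itself):

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# High-dimensional, low-sample regime, where the plain sample
# covariance is ill-conditioned and vanilla LDA is ill-posed.
X, y = make_classification(n_samples=80, n_features=200,
                           n_informative=10, random_state=0)

# solver="lsqr" with shrinkage="auto" regularizes the covariance
# toward a scaled identity using the Ledoit-Wolf estimator.
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
clf.fit(X, y)
print(clf.score(X, y))
```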
