Search Results for author: Bohan Wang

Found 26 papers, 5 papers with code

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond

no code implementations22 Mar 2024 Bohan Wang, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Wei Chen

This paper aims to clearly distinguish between Stochastic Gradient Descent with Momentum (SGDM) and Adam in terms of their convergence rates.

Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space

no code implementations24 Jan 2024 Mingyang Yi, Bohan Wang

In this paper, we aim to enrich the continuous optimization methods in the Wasserstein space by extending the gradient flow into the stochastic gradient descent (SGD) flow and stochastic variance reduction gradient (SVRG) flow.

Stochastic Optimization

Large Catapults in Momentum Gradient Descent with Warmup: An Empirical Study

no code implementations25 Nov 2023 Prin Phunyaphibarn, Junghyun Lee, Bohan Wang, Huishuai Zhang, Chulhee Yun

Although gradient descent with momentum is widely used in modern deep learning, a concrete understanding of its effects on the training trajectory remains elusive.

Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity

no code implementations27 Oct 2023 Bohan Wang, Jingwen Fu, Huishuai Zhang, Nanning Zheng, Wei Chen

Recently, Arjevani et al. [1] established a lower bound on the iteration complexity of first-order optimization under an $L$-smoothness condition and a bounded-noise-variance assumption.
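
For context, the referenced lower bound (as commonly stated, up to constants; see the paper for the precise statement) says that any first-order stochastic method on an $L$-smooth objective $f$ with gradient-noise variance at most $\sigma^2$ needs $\Omega\left(\Delta L \sigma^{2} \epsilon^{-4}\right)$ stochastic gradient evaluations to find a point $x$ with $\mathbb{E}\|\nabla f(x)\| \le \epsilon$, where $\Delta = f(x_0) - \inf f$.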

Fast Conditional Mixing of MCMC Algorithms for Non-log-concave Distributions

no code implementations NeurIPS 2023 Xiang Cheng, Bohan Wang, Jingzhao Zhang, Yusong Zhu

On the theory side, however, MCMC algorithms suffer from a slow mixing rate when $\pi(x)$ is non-log-concave.

When and Why Momentum Accelerates SGD: An Empirical Study

no code implementations15 Jun 2023 Jingwen Fu, Bohan Wang, Huishuai Zhang, Zhizheng Zhang, Wei Chen, Nanning Zheng

In a comparison of SGDM and SGD with the same effective learning rate $\eta_{ef}$ and the same batch size, we observe a consistent pattern: when $\eta_{ef}$ is small, SGDM and SGD have almost identical empirical training losses; once $\eta_{ef}$ exceeds a certain threshold, SGDM begins to outperform SGD.
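
As a rough illustration of this kind of comparison (not the paper's experimental setup; the convention $\eta_{ef} = \eta / (1 - \beta)$ for momentum $\beta$ is an assumption here), one could match the effective learning rate across the two optimizers as follows:

```python
# Hypothetical sketch: compare SGD and SGDM at the same effective learning rate
# eta_ef, assuming eta_ef = lr / (1 - momentum). Toy data and model only.
import torch

torch.manual_seed(0)
X, y = torch.randn(256, 10), torch.randn(256, 1)

def final_loss(momentum, eta_ef, steps=200):
    model = torch.nn.Linear(10, 1)
    opt = torch.optim.SGD(model.parameters(),
                          lr=eta_ef * (1 - momentum),  # keep eta_ef fixed across runs
                          momentum=momentum)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

for eta_ef in (0.01, 0.05, 0.2):
    print(eta_ef, final_loss(0.0, eta_ef), final_loss(0.9, eta_ef))
```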

ALO-VC: Any-to-any Low-latency One-shot Voice Conversion

no code implementations1 Jun 2023 Bohan Wang, Damien Ronssin, Milos Cernak

This paper presents ALO-VC, a non-parallel, low-latency, one-shot voice conversion method based on phonetic posteriorgrams (PPGs).

Voice Conversion

Convergence of AdaGrad for Non-convex Objectives: Simple Proofs and Relaxed Assumptions

no code implementations29 May 2023 Bohan Wang, Huishuai Zhang, Zhi-Ming Ma, Wei Chen

We provide a simple convergence proof for AdaGrad optimizing non-convex objectives under only affine noise variance and bounded smoothness assumptions.
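
For reference, a minimal sketch of the AdaGrad update that the convergence result concerns, on a toy non-convex objective (an illustration only, not the paper's setting or proof):

```python
# Minimal AdaGrad sketch on a toy non-convex objective (illustration only).
import numpy as np

def adagrad(grad_fn, x0, lr=0.5, eps=1e-8, steps=500):
    x = np.asarray(x0, dtype=float)
    g2_sum = np.zeros_like(x)                      # running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        g2_sum += g * g
        x = x - lr * g / (np.sqrt(g2_sum) + eps)   # coordinate-wise step sizes
    return x

# toy objective: f(x) = sum(x_i^2) + sin(sum(x_i)); its gradient is 2x + cos(sum(x))
grad = lambda x: 2 * x + np.cos(np.sum(x))
print(adagrad(grad, x0=np.ones(5)))
```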

On the Trade-off of Intra-/Inter-class Diversity for Supervised Pre-training

no code implementations NeurIPS 2023 Jieyu Zhang, Bohan Wang, Zhengyu Hu, Pang Wei Koh, Alexander Ratner

Pre-training datasets are critical for building state-of-the-art machine learning models, motivating rigorous study of their impact on downstream tasks.

O-GNN: Incorporating Ring Priors into Molecular Modeling

1 code implementation ICLR 2023 Jinhua Zhu, Kehan Wu, Bohan Wang, Yingce Xia, Shufang Xie, Qi Meng, Lijun Wu, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu

Despite the recent success of molecular modeling with graph neural networks (GNNs), few models explicitly take rings in compounds into consideration, consequently limiting the expressiveness of the models.

 Ranked #1 on Graph Regression on PCQM4M-LSC (Validation MAE metric)

Graph Regression, Molecular Property Prediction +3

Regularization of polynomial networks for image recognition

no code implementations CVPR 2023 Grigorios G Chrysos, Bohan Wang, Jiankang Deng, Volkan Cevher

We introduce a class of polynomial networks (PNs) that reach the performance of ResNet across a range of six benchmarks.

DiGress: Discrete Denoising diffusion for graph generation

1 code implementation29 Sep 2022 Clement Vignac, Igor Krawczuk, Antoine Siraudin, Bohan Wang, Volkan Cevher, Pascal Frossard

This work introduces DiGress, a discrete denoising diffusion model for generating graphs with categorical node and edge attributes.
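
As a minimal sketch of the general mechanism, here is one forward (noising) step over categorical node labels with a uniform transition matrix; DiGress itself uses more structured transition matrices and also corrupts edge types, so treat this purely as an illustration:

```python
# One discrete-diffusion noising step for categorical node labels (illustration).
import numpy as np

rng = np.random.default_rng(0)
K = 4                                                            # node categories
beta_t = 0.1                                                     # corruption strength at step t
Q_t = (1 - beta_t) * np.eye(K) + (beta_t / K) * np.ones((K, K))  # uniform transition matrix

def noise_step(x_onehot):
    probs = x_onehot @ Q_t            # row i: distribution of the noised label of node i
    return np.array([rng.choice(K, p=p) for p in probs])

x0 = np.eye(K)[rng.integers(0, K, size=6)]                       # 6 nodes with one-hot labels
print(noise_step(x0))
```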

Denoising, Edge Classification +1

Provable Adaptivity in Adam

no code implementations21 Aug 2022 Bohan Wang, Yushun Zhang, Huishuai Zhang, Qi Meng, Zhi-Ming Ma, Tie-Yan Liu, Wei Chen

In particular, existing analyses of Adam cannot clearly demonstrate its advantage over SGD.

Attribute

Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD

no code implementations NeurIPS 2021 Bohan Wang, Huishuai Zhang, Jieyu Zhang, Qi Meng, Wei Chen, Tie-Yan Liu

We prove that, under a constraint guaranteeing low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance when the prior and the posterior are jointly optimized.
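
A rough sketch of how such anisotropic SGLD noise could be instantiated, setting the noise covariance to the square root of an estimated gradient covariance; this is an assumption-laden illustration, not the paper's algorithm:

```python
# SGLD steps with anisotropic noise whose covariance is sqrt(gradient covariance).
import numpy as np

rng = np.random.default_rng(0)

def sgld_aniso(per_example_grads, theta, lr=1e-3, steps=500):
    for _ in range(steps):
        G = per_example_grads(theta)            # shape (n, d): one gradient per example
        g = G.mean(axis=0)
        C = np.cov(G, rowvar=False) + 1e-8 * np.eye(G.shape[1])   # gradient covariance
        w, V = np.linalg.eigh(C)
        # target noise covariance Sigma = C^{1/2}; sampling N(0, Sigma) uses Sigma^{1/2} = C^{1/4}
        noise = V @ (np.clip(w, 0.0, None) ** 0.25 * rng.standard_normal(len(w)))
        theta = theta - lr * g + np.sqrt(2 * lr) * noise
    return theta

# toy example: per-example gradients of f_i(theta) = 0.5 * (a_i^T theta)^2
A = rng.standard_normal((32, 3))
grads = lambda theta: A * (A @ theta)[:, None]
print(sgld_aniso(grads, theta=np.ones(3)))
```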

Generalization Bounds

Does Momentum Change the Implicit Regularization on Separable Data?

no code implementations8 Oct 2021 Bohan Wang, Qi Meng, Huishuai Zhang, Ruoyu Sun, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

The momentum acceleration technique is widely adopted in many optimization algorithms.

Robustness, Privacy, and Generalization of Adversarial Training

1 code implementation25 Dec 2020 Fengxiang He, Shaopeng Fu, Bohan Wang, DaCheng Tao

This measure can be approximated empirically by an asymptotically consistent estimator, the empirical robustified intensity.

Generalization Bounds, Privacy Preserving

The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks

1 code implementation11 Dec 2020 Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu

Besides GD, adaptive algorithms such as AdaGrad, RMSProp, and Adam are popular owing to their fast training.

Tighter Generalization Bounds for Iterative Differentially Private Learning Algorithms

no code implementations18 Jul 2020 Fengxiang He, Bohan Wang, DaCheng Tao

This paper studies the relationship between generalization and privacy preservation in iterative learning algorithms via two sequential steps.

Federated Learning, Generalization Bounds

Piecewise linear activations substantially shape the loss surfaces of neural networks

no code implementations ICLR 2020 Fengxiang He, Bohan Wang, DaCheng Tao

This result holds for any neural network with arbitrary depth and arbitrary piecewise linear activation functions (excluding linear functions) under most loss functions in practice.
