Search Results for author: Belhal Karimi

Found 16 papers, 0 papers with code

Non-asymptotic Analysis of Biased Stochastic Approximation Scheme

no code implementations • 2 Feb 2019 • Belhal Karimi, Blazej Miasojedow, Eric Moulines, Hoi-To Wai

We illustrate these settings with the online EM algorithm and the policy-gradient method for average reward maximization in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

An Optimistic Acceleration of AMSGrad for Nonconvex Optimization

no code implementations • ICLR 2020 • Jun-Kun Wang, Xiaoyun Li, Belhal Karimi, Ping Li

We propose a new variant of AMSGrad, a popular adaptive gradient based optimization algorithm widely used for training deep neural networks.

Paper
Add Code

On the Global Convergence of (Fast) Incremental Expectation Maximization Methods

no code implementations • NeurIPS 2019 • Belhal Karimi, Hoi-To Wai, Eric Moulines, Marc Lavielle

To alleviate this problem, Neal and Hinton have proposed an incremental version of the EM (iEM) in which at each iteration the conditional expectation of the latent data (E-step) is updated only for a mini-batch of observations.

Paper
Add Code

FedSKETCH: Communication-Efficient and Private Federated Learning via Sketching

no code implementations • 11 Aug 2020 • Farzin Haddadpour, Belhal Karimi, Ping Li, Xiaoyun Li

Communication complexity and privacy are the two key challenges in Federated Learning where the goal is to perform a distributed learning through a large volume of devices.

Federated Learning

Paper
Add Code

HWA: Hyperparameters Weight Averaging in Bayesian Neural Networks

no code implementations • pproximateinference AABI Symposium 2021 • Belhal Karimi, Ping Li

Bayesian neural networks attempt to combine the strong predictive performance of neural networks with formal quantification of uncertainty of the predicted output in the Bayesian framework.

Paper
Add Code

Towards Better Generalization of Adaptive Gradient Methods

no code implementations • NeurIPS 2020 • Yingxue Zhou, Belhal Karimi, Jinxing Yu, Zhiqiang Xu, Ping Li

Adaptive gradient methods such as AdaGrad, RMSprop and Adam have been optimizers of choice for deep learning due to their fast training speed.

Paper
Add Code

MISSO: Minimization by Incremental Stochastic Surrogate Optimization for Large Scale Nonconvex and Nonsmooth Problems

no code implementations • 1 Jan 2021 • Belhal Karimi, Hoi To Wai, Eric Moulines, Ping Li

Many constrained, nonconvex and nonsmooth optimization problems can be tackled using the majorization-minimization (MM) method which alternates between constructing a surrogate function which upper bounds the objective function, and then minimizing this surrogate.

Variational Inference

Paper
Add Code

Convergent Adaptive Gradient Methods in Decentralized Optimization

no code implementations • 1 Jan 2021 • Xiangyi Chen, Belhal Karimi, Weijie Zhao, Ping Li

Specifically, we propose a general algorithmic framework that can convert existing adaptive gradient methods to their decentralized counterparts.

Distributed Optimization

Paper
Add Code

On the Convergence of Decentralized Adaptive Gradient Methods

no code implementations • 7 Sep 2021 • Xiangyi Chen, Belhal Karimi, Weijie Zhao, Ping Li

Adaptive gradient methods including Adam, AdaGrad, and their variants have been very successful for training deep learning models, such as neural networks.

Distributed Computing Distributed Optimization

Paper
Add Code

Layer-wise and Dimension-wise Locally Adaptive Federated Learning

no code implementations • 1 Oct 2021 • Belhal Karimi, Ping Li, Xiaoyun Li

In the emerging paradigm of Federated Learning (FL), large amount of clients such as mobile devices are used to train possibly high-dimensional models on their respective data.

Federated Learning

Paper
Add Code

A Class of Two-Timescale Stochastic EM Algorithms for Nonconvex Latent Variable Models

no code implementations • 18 Mar 2022 • Belhal Karimi, Ping Li

We motivate the choice of a double dynamic by invoking the variance reduction virtue of each stage of the method on both sources of noise: the index sampling for the incremental update and the MC approximation.

Paper
Add Code

Joint learning of object graph and relation graph for visual question answering

no code implementations • 9 May 2022 • Hao Li, Xu Li, Belhal Karimi, Jie Chen, Mingming Sun

Modeling visual question answering(VQA) through scene graphs can significantly improve the reasoning accuracy and interpretability.

Attribute Question Answering +2

Paper
Add Code

On Distributed Adaptive Optimization with Gradient Compression

no code implementations • ICLR 2022 • Xiaoyun Li, Belhal Karimi, Ping Li

We study COMP-AMS, a distributed optimization framework based on gradient averaging and adaptive AMSGrad algorithm.

Distributed Optimization

Paper
Add Code

Variational Flow Graphical Model

no code implementations • 6 Jul 2022 • Shaogang Ren, Belhal Karimi, Dingcheng Li, Ping Li

VFGs learn the representation of high dimensional data via a message-passing scheme by integrating flow-based functions through variational inference.

Representation Learning Variational Inference

Paper
Add Code

FeatureBox: Feature Engineering on GPUs for Massive-Scale Ads Systems

no code implementations • 26 Sep 2022 • Weijie Zhao, Xuewu Jiao, Xinsheng Luo, Jingxue Li, Belhal Karimi, Ping Li

In this paper, we propose FeatureBox, a novel end-to-end training framework that pipelines the feature extraction and the training on GPU servers to save the intermediate I/O of the feature extraction.

Feature Engineering Management +1

Paper
Add Code

STANLEY: Stochastic Gradient Anisotropic Langevin Dynamics for Learning Energy-Based Models

no code implementations • 19 Oct 2023 • Belhal Karimi, Jianwen Xie, Ping Li

We propose in this paper, STANLEY, a STochastic gradient ANisotropic LangEvin dYnamics, for sampling high dimensional data.

Image Generation

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.