no code implementations • 28 Jun 2023 • Krishnakumar Balasubramanian, Larry Goldstein, Nathan Ross, Adil Salim
Specializing our general result, we obtain the first bounds on the Gaussian random field approximation of wide random neural networks of any depth and Lipschitz activation functions at the random field level.
no code implementations • 20 Jun 2023 • Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li
Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP.
Ranked #41 on Code Generation on HumanEval
no code implementations • 10 Apr 2023 • Michael Diao, Krishnakumar Balasubramanian, Sinho Chewi, Adil Salim
Of key interest in statistics and machine learning is Gaussian VI, which approximates $\pi$ by minimizing the Kullback-Leibler (KL) divergence to $\pi$ over the space of Gaussians.
no code implementations • 22 Sep 2022 • Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru R. Zhang
We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL$\cdot$E 2.
no code implementations • 2 Jun 2022 • Lukang Sun, Adil Salim, Peter Richtárik
Federated learning uses a set of techniques to efficiently distribute the training of a machine learning algorithm across several devices that own the training data.
no code implementations • 13 Feb 2022 • Yongxin Chen, Sinho Chewi, Adil Salim, Andre Wibisono
We study the proximal sampler of Lee, Shen, and Tian (2021) and obtain new convergence guarantees under weaker assumptions than strong log-concavity: namely, our results hold for (1) weakly log-concave targets, and (2) targets satisfying isoperimetric assumptions which allow for non-log-concavity.
no code implementations • 10 Feb 2022 • Krishnakumar Balasubramanian, Sinho Chewi, Murat A. Erdogdu, Adil Salim, Matthew Zhang
For the task of sampling from a density $\pi \propto \exp(-V)$ on $\mathbb{R}^d$, where $V$ is possibly non-convex but $L$-gradient Lipschitz, we prove that averaged Langevin Monte Carlo outputs a sample with $\varepsilon$-relative Fisher information after $O( L^2 d^2/\varepsilon^2)$ iterations.
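As a rough illustration of the sampler described above, the sketch below runs the standard Langevin Monte Carlo update $x_{k+1} = x_k - h\nabla V(x_k) + \sqrt{2h}\,\xi_k$ and returns the iterate at a uniformly random index, which is one way to draw from the average of the iterates' laws. Reading "averaged" this way is an assumption, and the function names, the step size, and the test potential $V(x) = \sum_i (x_i^2/2 + 2\cos x_i)$ (non-convex, with a 3-Lipschitz gradient) are all hypothetical choices, not taken from the paper.

```python
import numpy as np

def averaged_lmc(grad_V, x0, step, n_iters, rng):
    """Run Langevin Monte Carlo and stop at a uniformly random iteration.

    Stopping at a uniform index in {1, ..., n_iters} yields a sample from
    the mixture (average) of the laws of the iterates.
    """
    x = np.asarray(x0, dtype=float).copy()
    stop = rng.integers(1, n_iters + 1)  # uniform over 1..n_iters
    for _ in range(stop):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_V(x) + np.sqrt(2.0 * step) * noise
    return x

def grad_V(x):
    # Gradient of the illustrative potential V(x) = sum(x^2 / 2 + 2 cos x).
    return x - 2.0 * np.sin(x)

rng = np.random.default_rng(0)
samples = np.array(
    [averaged_lmc(grad_V, np.zeros(2), step=0.01, n_iters=500, rng=rng)
     for _ in range(200)]
)
```

Each call is an independent chain, so `samples` holds 200 independent draws in dimension 2.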
no code implementations • 6 Jun 2021 • Adil Salim, Lukang Sun, Peter Richtárik
We first establish the convergence of the algorithm.
no code implementations • 22 Feb 2021 • Adil Salim, Laurent Condat, Dmitry Kovalev, Peter Richtárik
Optimization problems under affine constraints appear in various areas of machine learning.
Optimization and Control
no code implementations • NeurIPS 2020 • Dmitry Kovalev, Adil Salim, Peter Richtárik
We propose two new algorithms for this decentralized optimization problem and equip them with complexity guarantees.
no code implementations • NeurIPS 2020 • Anna Korba, Adil Salim, Michael Arbel, Giulia Luise, Arthur Gretton
We study the Stein Variational Gradient Descent (SVGD) algorithm, which optimises a set of particles to approximate a target probability distribution $\pi\propto e^{-V}$ on $\mathbb{R}^d$.
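For readers unfamiliar with SVGD, here is a minimal NumPy sketch of the usual particle update, $x_i \leftarrow x_i + \frac{\epsilon}{n}\sum_j \big[k(x_j, x_i)\nabla \log \pi(x_j) + \nabla_{x_j} k(x_j, x_i)\big]$ with $\nabla \log \pi = -\nabla V$. The Gaussian kernel with fixed bandwidth, the step size, the standard Gaussian target, and all names are assumptions made for illustration; the paper's analysis is not tied to these choices.

```python
import numpy as np

def svgd_step(X, grad_V, step, bandwidth):
    """One SVGD update for particles X of shape (n, d), Gaussian kernel."""
    n = X.shape[0]
    diffs = X[:, None, :] - X[None, :, :]          # diffs[j, i] = x_j - x_i
    K = np.exp(-np.sum(diffs**2, axis=-1) / (2.0 * bandwidth**2))  # K[j, i]
    score = -grad_V(X)                             # grad log pi at each particle
    drive = K.T @ score                            # sum_j K[j, i] * score_j
    # sum_j grad_{x_j} k(x_j, x_i) = sum_j K[j, i] * (x_i - x_j) / bandwidth^2
    repulse = -np.einsum("ji,jid->id", K, diffs) / bandwidth**2
    return X + step * (drive + repulse) / n

# Illustrative target: standard Gaussian in 1D, V(x) = x^2 / 2.
grad_V = lambda X: X
particles = np.linspace(4.0, 6.0, 50).reshape(-1, 1)  # start far from the mode
for _ in range(500):
    particles = svgd_step(particles, grad_V, step=0.1, bandwidth=1.0)
```

The first term pulls particles toward high-density regions of $\pi$, while the kernel-gradient term pushes them apart so they do not collapse onto the mode.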
no code implementations • NeurIPS 2020 • Adil Salim, Peter Richtárik
In the second part of this paper, we use the duality gap arising from the first part to study the complexity of the Proximal Stochastic Gradient Langevin Algorithm (PSGLA), which can be seen as a generalization of the Projected Langevin Algorithm.
no code implementations • 3 Apr 2020 • Adil Salim, Laurent Condat, Konstantin Mishchenko, Peter Richtárik
We consider minimizing the sum of three convex functions, where the first one F is smooth, the second one is nonsmooth and proximable and the third one is the composition of a nonsmooth proximable function with a linear operator L. This template problem has many applications, for instance, in image processing and machine learning.
no code implementations • NeurIPS 2020 • Adil Salim, Anna Korba, Giulia Luise
Using techniques from convex optimization and optimal transport, we analyze the FB scheme as a minimization algorithm on the Wasserstein space.
no code implementations • 20 Dec 2019 • Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Peter Richtárik, Adil Salim, Martin Takáč
We propose basic and natural assumptions under which iterative optimization methods with compressed iterates can be analyzed.
no code implementations • 25 Sep 2019 • Sélim Chraibi, Adil Salim, Samuel Horváth, Filip Hanzely, Peter Richtárik
Preconditioning a minimization algorithm improves its convergence and can, in some extreme cases, lead to a minimizer in one iteration.
1 code implementation • NeurIPS 2019 • Michael Arbel, Anna Korba, Adil Salim, Arthur Gretton
We construct a Wasserstein gradient flow of the maximum mean discrepancy (MMD) and study its convergence properties.
1 code implementation • NeurIPS 2019 • Adil Salim, Dmitry Kovalev, Peter Richtárik
We propose a new algorithm, the Stochastic Proximal Langevin Algorithm (SPLA), for sampling from a log-concave distribution.
no code implementations • 23 Jan 2019 • Pascal Bianchi, Walid Hachem, Adil Salim
The proposed algorithm is proven to converge to a saddle point of the Lagrangian function.
no code implementations • 3 Apr 2018 • Adil Salim, Pascal Bianchi, Walid Hachem
The Douglas-Rachford algorithm converges to a minimizer of a sum of two convex functions.
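A small sketch of the classical (deterministic) Douglas-Rachford iteration for minimizing $f + g$ may help fix ideas: $x = \mathrm{prox}_{\gamma f}(z)$, $y = \mathrm{prox}_{\gamma g}(2x - z)$, $z \leftarrow z + y - x$. The toy problem $f(x) = \tfrac12\|x - a\|^2$, $g(x) = \lambda\|x\|_1$, the parameter values, and the function names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximity operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def douglas_rachford(prox_f, prox_g, z0, n_iters=200):
    """Classical Douglas-Rachford splitting; returns prox_f of the final z."""
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(n_iters):
        x = prox_f(z)
        y = prox_g(2.0 * x - z)   # prox of g at the reflected point
        z = z + y - x
    return prox_f(z)

# Toy problem: minimize 0.5 * ||x - a||^2 + lam * ||x||_1.
a = np.array([3.0, -0.5, 1.2])
lam, gamma = 1.0, 1.0
prox_f = lambda z: (z + gamma * a) / (1.0 + gamma)   # prox of gamma * f
prox_g = lambda v: soft_threshold(v, gamma * lam)    # prox of gamma * g
x_star = douglas_rachford(prox_f, prox_g, np.zeros(3))
```

For this toy problem the minimizer is known in closed form as `soft_threshold(a, lam)`, which gives a direct check on the output.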
no code implementations • 19 Dec 2017 • Adil Salim, Pascal Bianchi, Walid Hachem
When applying the proximal gradient algorithm to solve this problem, there exist quite affordable methods to implement the proximity operator (backward step) in the special case where the graph is a simple path without loops.