Search Results for author: Sébastien Bubeck

Found 43 papers, 6 papers with code

Textbooks Are All You Need II: phi-1.5 technical report

no code implementations11 Sep 2023 Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, Yin Tat Lee

We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories -- a 10 million parameter model that can produce coherent English -- and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art.

Common Sense Reasoning Question Answering

Sparks of Artificial General Intelligence: Early experiments with GPT-4

1 code implementation22 Mar 2023 Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models.

Arithmetic Reasoning Mathematical Reasoning +1

Learning threshold neurons via the "edge of stability"

no code implementations14 Dec 2022 Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang

For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias).

Inductive Bias

How to Fine-Tune Vision Models with SGD

no code implementations17 Nov 2022 Ananya Kumar, Ruoqi Shen, Sébastien Bubeck, Suriya Gunasekar

SGD (with momentum) and AdamW are the two most used optimizers for fine-tuning large neural networks in computer vision.
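
The two update rules being compared can be sketched in a few lines. Below is a minimal pure-Python illustration of SGD with momentum and AdamW on a toy one-dimensional quadratic; the hyperparameters are illustrative defaults, not the paper's fine-tuning settings.

```python
import math

def sgd_momentum(grad, w0, lr=0.05, mu=0.9, steps=500):
    """Minimize via SGD with (heavy-ball) momentum."""
    w, v = w0, 0.0
    for _ in range(steps):
        g = grad(w)
        v = mu * v + g          # accumulate a velocity term
        w -= lr * v
    return w

def adamw(grad, w0, lr=0.05, b1=0.9, b2=0.999, eps=1e-8, wd=0.0, steps=500):
    """Minimize via AdamW (Adam with decoupled weight decay)."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g            # first-moment estimate
        v = b2 * v + (1 - b2) * g * g        # second-moment estimate
        mhat = m / (1 - b1 ** t)             # bias correction
        vhat = v / (1 - b2 ** t)
        w -= lr * (mhat / (math.sqrt(vhat) + eps) + wd * w)
    return w

# Toy objective f(w) = (w - 3)^2 with gradient 2(w - 3); both reach w = 3.
grad = lambda w: 2.0 * (w - 3.0)
w_sgd = sgd_momentum(grad, 0.0)
w_adamw = adamw(grad, 0.0)
```

The structural difference the paper studies is visible here: SGD rescales the raw gradient uniformly, while AdamW normalizes each step by a running second-moment estimate.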

Unveiling Transformers with LEGO: a synthetic reasoning task

1 code implementation9 Jun 2022 Yi Zhang, Arturs Backurs, Sébastien Bubeck, Ronen Eldan, Suriya Gunasekar, Tal Wagner

We study how the trained models eventually succeed at the task, and in particular, we manage to understand some of the attention heads as well as how the information flows in the network.

Learning to Execute

Data Augmentation as Feature Manipulation

no code implementations3 Mar 2022 Ruoqi Shen, Sébastien Bubeck, Suriya Gunasekar

In this work we consider another angle and study the effect of data augmentation on the dynamics of the learning process.

Data Augmentation

Adversarial Examples in Multi-Layer Random ReLU Networks

no code implementations NeurIPS 2021 Peter L. Bartlett, Sébastien Bubeck, Yeshwanth Cherapanamjeri

We consider the phenomenon of adversarial examples in ReLU networks with independent Gaussian parameters.

A Universal Law of Robustness via Isoperimetry

no code implementations NeurIPS 2021 Sébastien Bubeck, Mark Sellke

Classically, data interpolation with a parametrized model class is possible as long as the number of parameters is larger than the number of equations to be satisfied.
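
The classical counting argument in this abstract can be made concrete: with as many parameters as data points, exact interpolation reduces to solving a square linear system. A minimal pure-Python sketch with a polynomial model (the data values are arbitrary illustrative numbers):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a square system Ax = b."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# n data points and n polynomial coefficients: #parameters == #equations,
# so the Vandermonde system has an exact interpolating solution.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0, -0.5, 2.0, 0.3, -1.2]
V = [[x ** j for j in range(len(xs))] for x in xs]
coef = solve(V, ys)
preds = [sum(c * x ** j for j, c in enumerate(coef)) for x in xs]
```

The paper's point is that this parameter count suffices for plain interpolation but not for *smooth* (small-Lipschitz-constant) interpolation, which requires many more parameters.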

A single gradient step finds adversarial examples on random two-layers neural networks

no code implementations NeurIPS 2021 Sébastien Bubeck, Yeshwanth Cherapanamjeri, Gauthier Gidel, Rémi Tachet des Combes

Daniely and Schacham recently showed that gradient descent finds adversarial examples on random undercomplete two-layers ReLU neural networks.
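
The attack primitive referenced here can be sketched directly: take a random two-layer ReLU network and move the input a small step against the output gradient. This is a generic illustration of the single-gradient-step idea, not the paper's exact regime or construction; all dimensions and constants below are illustrative.

```python
import math, random

random.seed(0)
d, k = 20, 50                      # input dimension, hidden width
W = [[random.gauss(0, 1 / math.sqrt(d)) for _ in range(d)] for _ in range(k)]
a = [random.gauss(0, 1 / math.sqrt(k)) for _ in range(k)]

def f(x):
    """Two-layer ReLU network with random weights."""
    return sum(a[j] * max(0.0, sum(W[j][i] * x[i] for i in range(d)))
               for j in range(k))

def grad_f(x):
    """Input-space gradient; the ReLU derivative is the active-unit indicator."""
    g = [0.0] * d
    for j in range(k):
        if sum(W[j][i] * x[i] for i in range(d)) > 0:
            for i in range(d):
                g[i] += a[j] * W[j][i]
    return g

x = [random.gauss(0, 1) for _ in range(d)]
g = grad_f(x)
eta = 1e-4
x_adv = [xi - eta * gi for xi, gi in zip(x, g)]   # one step against the gradient
```

Because the network is piecewise linear, a small enough step against the gradient decreases the output by roughly eta times the squared gradient norm.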

A law of robustness for two-layers neural networks

no code implementations30 Sep 2020 Sébastien Bubeck, Yuanzhi Li, Dheeraj Nagaraj

We make a precise conjecture that, for any Lipschitz activation function and for most datasets, any two-layers neural network with $k$ neurons that perfectly fit the data must have its Lipschitz constant larger (up to a constant) than $\sqrt{n/k}$ where $n$ is the number of datapoints.


Network size and weights size for memorization with two-layers neural networks

no code implementations4 Jun 2020 Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Dan Mikulincer

In contrast, we propose a new training procedure for ReLU networks, based on complex (as opposed to real) recombination of the neurons, for which we show approximate memorization with $O\left(\frac{n}{d} \cdot \frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons as well as nearly-optimal size of the weights.


Online Multiserver Convex Chasing and Optimization

no code implementations15 Apr 2020 Sébastien Bubeck, Yuval Rabani, Mark Sellke

We introduce the problem of $k$-chasing of convex functions, a simultaneous generalization of both the famous $k$-server problem in $\mathbb{R}^d$, and of the problem of chasing convex bodies and functions.


Online Learning for Active Cache Synchronization

1 code implementation ICML 2020 Andrey Kolobov, Sébastien Bubeck, Julian Zimmert

Existing multi-armed bandit (MAB) models make two implicit assumptions: an arm generates a payoff only when it is played, and the agent observes every payoff that is generated.

How to trap a gradient flow

no code implementations9 Jan 2020 Sébastien Bubeck, Dan Mikulincer

This viewpoint was explored in 1993 by Vavasis, who proposed an algorithm which, for any fixed finite dimension $d$, improves upon the $O(1/\varepsilon^2)$ oracle complexity of gradient descent.
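
The $O(1/\varepsilon^2)$ baseline referenced here is plain gradient descent run until an $\varepsilon$-stationary point is found, with the cost measured in gradient-oracle calls. A toy sketch of that baseline (illustrative only; neither Vavasis's algorithm nor the paper's improvement is reproduced):

```python
def gd_until_stationary(grad, x0, eps=1e-3, lr=0.01, max_calls=10**6):
    """Run gradient descent until |grad| <= eps; count gradient-oracle calls."""
    x, calls = x0, 0
    while calls < max_calls:
        g = grad(x)
        calls += 1
        if abs(g) <= eps:          # eps-stationary point reached
            return x, calls
        x -= lr * g
    return x, calls

# Nonconvex toy objective f(x) = x^4 - 2x^2, gradient 4x^3 - 4x.
grad = lambda x: 4 * x**3 - 4 * x
x_star, n_calls = gd_until_stationary(grad, 0.3)
```

Starting at 0.3, the iterates descend into the local minimum at x = 1 and stop once the gradient norm drops below eps.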

Complexity of Highly Parallel Non-Smooth Convex Optimization

no code implementations NeurIPS 2019 Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford

Namely, we consider optimization algorithms interacting with a highly parallel gradient oracle, that is, one that can answer $\mathrm{poly}(d)$ gradient queries in parallel.

First-Order Bayesian Regret Analysis of Thompson Sampling

no code implementations2 Feb 2019 Sébastien Bubeck, Mark Sellke

Second we replace the entropy over combinatorial actions by a coordinate entropy, which allows us to obtain the first optimal worst-case bound for Thompson Sampling in the combinatorial setting.

Combinatorial Optimization Thompson Sampling
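
The base algorithm analyzed here, Thompson Sampling, is short enough to state in full for the classical Bernoulli setting (this is the standard non-combinatorial version, for illustration; the arm means and horizon are arbitrary):

```python
import random

def thompson_bernoulli(probs, T, seed=0):
    """Thompson Sampling for Bernoulli arms with Beta(1, 1) priors."""
    rng = random.Random(seed)
    K = len(probs)
    succ, fail = [1] * K, [1] * K          # Beta posterior parameters per arm
    pulls = [0] * K
    for _ in range(T):
        # Sample a mean from each posterior and play the argmax.
        samples = [rng.betavariate(succ[i], fail[i]) for i in range(K)]
        arm = max(range(K), key=lambda i: samples[i])
        reward = 1 if rng.random() < probs[arm] else 0
        succ[arm] += reward
        fail[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_bernoulli([0.2, 0.5, 0.8], T=2000)
```

Posterior sampling concentrates play on the best arm as evidence accumulates; the paper's contribution is regret bounds for this scheme that scale with the optimal action's loss rather than the horizon alone.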

Improved Path-length Regret Bounds for Bandits

no code implementations29 Jan 2019 Sébastien Bubeck, Yuanzhi Li, Haipeng Luo, Chen-Yu Wei

We study adaptive regret bounds in terms of the variation of the losses (the so-called path-length bounds) for both multi-armed bandit and more generally linear bandit.

Adversarial Examples from Cryptographic Pseudo-Random Generators

no code implementations15 Nov 2018 Sébastien Bubeck, Yin Tat Lee, Eric Price, Ilya Razenshteyn

In our recent work (Bubeck, Price, Razenshteyn, arXiv:1805.10204) we argued that adversarial examples in machine learning might be due to an inherent computational hardness of the problem.

Binary Classification General Classification

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

no code implementations NeurIPS 2018 Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié

Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

Optimization and Control

Adversarial examples from computational constraints

no code implementations25 May 2018 Sébastien Bubeck, Eric Price, Ilya Razenshteyn

First we prove that, for a broad set of classification tasks, the mere existence of a robust classifier implies that it can be found by a possibly exponential-time algorithm with relatively few training examples.

Binary Classification Classification +1

Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits

no code implementations ICML 2018 Zeyuan Allen-Zhu, Sébastien Bubeck, Yuanzhi Li

Regret bounds in online learning compare the player's performance to $L^*$, the optimal performance in hindsight with a fixed strategy.

Multi-Armed Bandits

Sparsity, variance and curvature in multi-armed bandits

no code implementations3 Nov 2017 Sébastien Bubeck, Michael B. Cohen, Yuanzhi Li

In (online) learning theory the concepts of sparsity, variance and curvature are well-understood and are routinely used to obtain refined regret and generalization bounds.

Generalization Bounds Learning Theory +1

Multi-scale Online Learning and its Applications to Online Auctions

no code implementations26 May 2017 Sébastien Bubeck, Nikhil R. Devanur, Zhiyi Huang, Rad Niazadeh

For the online posted pricing problem, we show regret bounds that scale with the best fixed price, rather than the range of the values.

Optimal algorithms for smooth and strongly convex distributed optimization in networks

1 code implementation ICML 2017 Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié

For centralized (i.e. master/slave) algorithms, we show that distributing Nesterov's accelerated gradient descent is optimal and achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_g}(1+\Delta\tau)\ln(1/\varepsilon))$, where $\kappa_g$ is the condition number of the (global) function to optimize and $\Delta$ is the diameter of the network.

Distributed Optimization regression

Kernel-based methods for bandit convex optimization

no code implementations11 Jul 2016 Sébastien Bubeck, Ronen Eldan, Yin Tat Lee

We consider the adversarial convex bandit problem and we build the first $\mathrm{poly}(T)$-time algorithm with $\mathrm{poly}(n) \sqrt{T}$-regret for this problem.

Black-box optimization with a politician

no code implementations15 Feb 2016 Sébastien Bubeck, Yin Tat Lee

We propose a new framework for black-box convex optimization which is well-suited for situations where gradient computations are expensive.

Multi-scale exploration of convex functions and bandit convex optimization

no code implementations23 Jul 2015 Sébastien Bubeck, Ronen Eldan

We construct a new map from a convex function to a distribution on its domain, with the property that this distribution is a multi-scale exploration of the function.

Sampling from a log-concave distribution with Projected Langevin Monte Carlo

no code implementations9 Jul 2015 Sébastien Bubeck, Ronen Eldan, Joseph Lehec

We extend the Langevin Monte Carlo (LMC) algorithm to compactly supported measures via a projection step, akin to projected Stochastic Gradient Descent (SGD).
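
The projected LMC iteration is a one-line modification of the standard Langevin step: gradient step, Gaussian noise, then projection back onto the support. A minimal one-dimensional sketch (the target, step size, and horizon are illustrative):

```python
import math, random

def projected_lmc(grad_U, proj, x0, eta=0.01, steps=5000, seed=0):
    """Projected Langevin Monte Carlo: gradient step + noise + projection."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(steps):
        noise = math.sqrt(2 * eta) * rng.gauss(0, 1)
        x = proj(x - eta * grad_U(x) + noise)   # project after the Langevin step
        samples.append(x)
    return samples

# Target: standard Gaussian restricted to K = [-1, 1], i.e. U(x) = x^2 / 2.
grad_U = lambda x: x
proj = lambda x: max(-1.0, min(1.0, x))         # Euclidean projection onto [-1, 1]
samples = projected_lmc(grad_U, proj, x0=0.0)
```

The projection step is what makes the chain well-defined for a compactly supported target, exactly as projected SGD keeps iterates feasible in constrained optimization.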

A geometric alternative to Nesterov's accelerated gradient descent

no code implementations26 Jun 2015 Sébastien Bubeck, Yin Tat Lee, Mohit Singh

The new algorithm has a simple geometric interpretation, loosely inspired by the ellipsoid method.
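
For reference, the baseline being reinterpreted, Nesterov's accelerated gradient descent, in a minimal one-dimensional sketch (the paper's geometric variant itself is not reproduced here; the smoothness constant is deliberately overestimated so several iterations are needed):

```python
def nesterov_agd(grad, x0, L, steps=200):
    """Nesterov's accelerated gradient descent for an L-smooth convex function."""
    x_prev, y = x0, x0
    for k in range(1, steps + 1):
        x = y - grad(y) / L                            # gradient step at lookahead point
        y = x + (k - 1.0) / (k + 2.0) * (x - x_prev)   # momentum extrapolation
        x_prev = x
    return x_prev

# f(x) = (x - 2)^2 has minimizer x* = 2; L = 10 is a conservative smoothness bound.
grad = lambda x: 2 * (x - 2.0)
x_star = nesterov_agd(grad, 0.0, L=10.0)
```

The extrapolation point y is what distinguishes the scheme from plain gradient descent and yields the accelerated $O(1/k^2)$ rate for smooth convex objectives.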

Bandit Convex Optimization: $\sqrt{T}$ Regret in One Dimension

no code implementations23 Feb 2015 Sébastien Bubeck, Ofer Dekel, Tomer Koren, Yuval Peres

We analyze the minimax regret of the adversarial bandit convex optimization problem.

Thompson Sampling

The entropic barrier: a simple and optimal universal self-concordant barrier

no code implementations4 Dec 2014 Sébastien Bubeck, Ronen Eldan

We prove that the Cramér transform of the uniform measure on a convex body in $\mathbb{R}^n$ is a $(1+o(1)) n$-self-concordant barrier, improving a seminal result of Nesterov and Nemirovski.
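
Concretely, the barrier in question can be written as follows (standard formulation of the entropic barrier, stated here for orientation; notation is assumed):

```latex
% Log-partition function of an exponential family supported on the body K
f(\theta) = \log \int_{K} e^{\langle \theta, x \rangle} \, dx,
\qquad \theta \in \mathbb{R}^n .

% The entropic barrier is the Fenchel conjugate of f, i.e. (up to an
% additive constant) the Cramér transform of the uniform measure on K:
f^{*}(x) = \sup_{\theta \in \mathbb{R}^n}
  \bigl\{ \langle \theta, x \rangle - f(\theta) \bigr\},
\qquad x \in \operatorname{int}(K).
```

The result above states that $f^{*}$ is a $(1+o(1))\,n$-self-concordant barrier for $K$.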

Convex Optimization: Algorithms and Complexity

3 code implementations20 May 2014 Sébastien Bubeck

In stochastic optimization we discuss stochastic gradient descent, mini-batches, random coordinate descent, and sublinear algorithms.

Stochastic Optimization
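
One of the methods surveyed, random coordinate descent, can be sketched in a few lines: each step updates a single randomly chosen coordinate with a coordinate-wise step size. A pure-Python illustration on a separable quadratic (all constants illustrative):

```python
import random

def random_coordinate_descent(grad_i, x, lrs, steps=4000, seed=0):
    """Minimize by updating one randomly chosen coordinate per step."""
    rng = random.Random(seed)
    d = len(x)
    for _ in range(steps):
        i = rng.randrange(d)
        x[i] -= lrs[i] * grad_i(x, i)    # partial-derivative step on coordinate i
    return x

# Separable quadratic f(x) = sum_i c_i * (x_i - t_i)^2.
c = [1.0, 4.0, 0.5]
t = [1.0, -2.0, 3.0]
grad_i = lambda x, i: 2 * c[i] * (x[i] - t[i])
lrs = [1 / (2 * ci) for ci in c]         # coordinate-wise 1/L_i step sizes
x = random_coordinate_descent(grad_i, [0.0, 0.0, 0.0], lrs)
```

The appeal analyzed in the monograph is that each step touches only one partial derivative, so per-iteration cost can be far below a full gradient evaluation.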

Most Correlated Arms Identification

no code implementations23 Apr 2014 Che-Yu Liu, Sébastien Bubeck

We study the problem of finding the most mutually correlated arms among many arms.

lil' UCB: An Optimal Exploration Algorithm for Multi-Armed Bandits

no code implementations27 Dec 2013 Kevin Jamieson, Matthew Malloy, Robert Nowak, Sébastien Bubeck

The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples.

Multi-Armed Bandits
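
The family of procedures this paper optimizes is easiest to see via the classical UCB1 index, which plays the arm maximizing empirical mean plus a confidence width; lil' UCB tightens that width to iterated-logarithm scale. A minimal UCB1 sketch for Bernoulli arms (arm means and horizon are illustrative):

```python
import math, random

def ucb1(probs, T, seed=0):
    """Classical UCB1 for Bernoulli arms (lil' UCB refines the width term)."""
    rng = random.Random(seed)
    K = len(probs)
    pulls = [0] * K
    sums = [0.0] * K
    for i in range(K):                      # play each arm once to initialize
        sums[i] += 1.0 if rng.random() < probs[i] else 0.0
        pulls[i] = 1
    for t in range(K + 1, T + 1):
        # Index: empirical mean + confidence width shrinking in pull count.
        idx = [sums[i] / pulls[i] + math.sqrt(2 * math.log(t) / pulls[i])
               for i in range(K)]
        arm = max(range(K), key=lambda i: idx[i])
        sums[arm] += 1.0 if rng.random() < probs[arm] else 0.0
        pulls[arm] += 1
    return pulls

pulls = ucb1([0.2, 0.8], T=2000)
```

For best-arm identification in the fixed confidence setting, the lil' UCB analysis shows the log-log width matches the law-of-the-iterated-logarithm lower bound on samples.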

Prior-free and prior-dependent regret bounds for Thompson Sampling

no code implementations NeurIPS 2013 Sébastien Bubeck, Che-Yu Liu

Building on the techniques of Audibert and Bubeck [2009] and Russo and Van Roy [2013] we first show that Thompson Sampling attains an optimal prior-free bound in the sense that for any prior distribution its Bayesian regret is bounded from above by $14 \sqrt{n K}$.

Thompson Sampling

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems

no code implementations25 Apr 2012 Sébastien Bubeck, Nicolò Cesa-Bianchi

Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off.

Multi-Bandit Best Arm Identification

no code implementations NeurIPS 2011 Victor Gabillon, Mohammad Ghavamzadeh, Alessandro Lazaric, Sébastien Bubeck

We first propose an algorithm called Gap-based Exploration (GapE) that focuses on the arms whose mean is close to the mean of the best arm in the same bandit (i.e., small gap).

Online Optimization in X-Armed Bandits

no code implementations NeurIPS 2008 Sébastien Bubeck, Gilles Stoltz, Csaba Szepesvári, Rémi Munos

We consider a generalization of stochastic bandit problems where the set of arms, X, is allowed to be a generic topological space.
