no code implementations • 5 Nov 2024 • Shiyun Lin, Simon Mauras, Nadav Merlis, Vianney Perchet

We aim to guarantee each worker the largest possible share from the utility in her best possible stable matching.

no code implementations • 30 Aug 2024 • Ahmed Ben Yahmed, Clément Calauzènes, Vianney Perchet

In the strategic multi-armed bandit setting, when arms possess perfect information about the player's behavior, they can establish an equilibrium in which (i) they retain almost all of their value and (ii) they leave the player with substantial (linear) regret.

no code implementations • 17 Jun 2024 • Matilde Tullii, Solenne Gaucher, Nadav Merlis, Vianney Perchet

For this model, our algorithm obtains a regret $\tilde{\mathcal{O}}(T^{(d+2\beta)/(d+3\beta)})$, where $d$ is the dimension of the context space.

no code implementations • 2 May 2024 • Ziyad Benomar, Vianney Perchet

The non-clairvoyant scheduling problem has gained new interest within learning-augmented algorithms, where the decision-maker is equipped with predictions without any quality guarantees.

no code implementations • 18 Mar 2024 • Nadav Merlis, Dorian Baudry, Vianney Perchet

In particular, we measure the ratio between the value of standard RL agents and that of agents with partial future-reward lookahead.

no code implementations • 20 Feb 2024 • Charles Arnal, Vivien Cabannes, Vianney Perchet

The combination of lightly supervised pre-training and online fine-tuning has played a key role in recent AI developments.

no code implementations • 1 Sep 2023 • Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

We study how to learn $\epsilon$-optimal strategies in zero-sum imperfect information games (IIG) with trajectory feedback.

no code implementations • NeurIPS 2023 • Mathieu Molina, Nicolas Gast, Patrick Loiseau, Vianney Perchet

We consider the problem of online allocation subject to a long-term fairness penalty.

no code implementations • 3 Jun 2023 • Felipe Garrido-Lucero, Benjamin Heymann, Maxime Vono, Patrick Loiseau, Vianney Perchet

We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain, with respect to some relevant pre-defined utility of a machine learning task, of aggregating an individual dataset with others.

no code implementations • 31 May 2023 • Hugo Richard, Etienne Boursier, Vianney Perchet

This motivates the harder, asynchronous multiplayer bandits problem, which was first tackled with an explore-then-commit (ETC) algorithm (see Dakdouk, 2022), with a regret upper bound of $\mathcal{O}(T^{\frac{2}{3}})$.
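Explore-then-commit, the baseline mentioned above, is simple to state: sample every arm uniformly for an exploration phase, then commit to the empirically best arm. Below is an illustrative single-player sketch with Bernoulli arms (the arm means, horizon, and exploration length are made up for the example), not the multiplayer algorithm of Dakdouk (2022):

```python
import random

def explore_then_commit(arm_means, horizon, explore_per_arm, seed=0):
    """ETC: uniform exploration phase, then commit to the empirical best arm."""
    rng = random.Random(seed)
    d = len(arm_means)
    sums = [0.0] * d
    total = 0.0
    # Exploration: pull each arm explore_per_arm times.
    for arm in range(d):
        for _ in range(explore_per_arm):
            r = 1.0 if rng.random() < arm_means[arm] else 0.0
            sums[arm] += r
            total += r
    best = max(range(d), key=lambda k: sums[k])
    # Commitment: play the empirical best arm for the remaining rounds.
    for _ in range(horizon - d * explore_per_arm):
        total += 1.0 if rng.random() < arm_means[best] else 0.0
    return best, total

best, total = explore_then_commit([0.1, 0.9], horizon=1000, explore_per_arm=50)
```

Tuning the exploration length against the horizon is exactly what produces the $T^{2/3}$-type regret trade-off.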

1 code implementation • 23 Dec 2022 • Côme Fiegel, Pierre Ménard, Tadashi Kozuno, Rémi Munos, Vianney Perchet, Michal Valko

Imperfect information games (IIG) are games in which each player only partially observes the current game state.

no code implementations • 29 Nov 2022 • Etienne Boursier, Vianney Perchet

Due mostly to their application to cognitive radio networks, multiplayer bandits have gained considerable interest over the last decade.

no code implementations • 23 Oct 2022 • Sasila Ilandarideva, Yannis Bekri, Anatoli Juditsky, Vianney Perchet

In this paper we discuss an application of Stochastic Approximation to statistical estimation of high-dimensional sparse parameters.

1 code implementation • 31 May 2022 • Nadav Merlis, Hugo Richard, Flore Sentenac, Corentin Odic, Mathieu Molina, Vianney Perchet

We study single-machine scheduling of jobs, each belonging to a job type that determines its duration distribution.

1 code implementation • 26 May 2022 • Vivien Cabannes, Francis Bach, Vianney Perchet, Alessandro Rudi

The workhorse of machine learning is stochastic gradient descent.
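As a minimal illustration of that workhorse, here is a hypothetical SGD loop on a noiseless least-squares objective (the data, step size, and iteration count are arbitrary choices for the example, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: find w minimizing E[(x.w - y)^2].
true_w = np.array([2.0, -1.0])
X = rng.normal(size=(1000, 2))
y = X @ true_w

w = np.zeros(2)
lr = 0.05
for t in range(2000):
    i = rng.integers(len(X))             # sample one data point at random
    grad = 2 * (X[i] @ w - y[i]) * X[i]  # stochastic gradient of the squared loss
    w -= lr * grad                       # descent step

# After enough steps, w is close to true_w.
```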

no code implementations • 15 Feb 2022 • Vianney Perchet, Philippe Rigollet, Thibaut Le Gouic

In the case of asymmetric values where optimal solutions need not exist but Nash equilibria do, our algorithm samples from an $\varepsilon$-Nash equilibrium with similar complexity but where implicit constants depend on various parameters of the game such as battlefield values.

no code implementations • 11 Dec 2021 • Evrard Garcelon, Kamalika Chaudhuri, Vianney Perchet, Matteo Pirotta

Contextual bandit algorithms are widely used in domains where it is desirable to provide a personalized service by leveraging contextual information, which may contain sensitive data that needs to be protected.

no code implementations • NeurIPS 2021 • Reda Ouhamma, Odalric Maillard, Vianney Perchet

We consider the problem of online linear regression in the stochastic setting.

no code implementations • NeurIPS 2021 • Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet

In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions.

no code implementations • 31 Jul 2021 • Flore Sentenac, Jialin Yi, Clément Calauzènes, Vianney Perchet, Milan Vojnovic

Finding an optimal matching in a weighted graph is a standard combinatorial problem.
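For intuition, a matching is a set of vertex-disjoint edges, and on tiny graphs the optimum can be found by brute force over edge subsets. A hypothetical sketch (exponential time, illustration only):

```python
from itertools import combinations

def max_weight_matching(edges):
    """Brute-force maximum-weight matching.

    edges: list of (u, v, weight) tuples. Only suitable for tiny graphs.
    """
    best_weight, best = 0, []
    for r in range(1, len(edges) + 1):
        for subset in combinations(edges, r):
            used = [u for (u, v, _) in subset] + [v for (u, v, _) in subset]
            if len(used) == len(set(used)):  # edges must be vertex-disjoint
                w = sum(weight for (_, _, weight) in subset)
                if w > best_weight:
                    best_weight, best = w, list(subset)
    return best_weight, best

# Triangle with one heavy edge: any two edges share a vertex,
# so the optimal matching is the single heaviest edge.
weight, matching = max_weight_matching([("a", "b", 3), ("b", "c", 2), ("a", "c", 2)])
```

Polynomial-time algorithms (e.g. Edmonds' blossom algorithm) exist for the offline problem; the difficulty studied in such papers is the online/sequential version.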

no code implementations • NeurIPS 2021 • Nathan Noiry, Flore Sentenac, Vianney Perchet

Motivated by sequential budgeted allocation problems, we investigate online matching problems where connections between vertices are not i.i.d. but follow fixed degree distributions -- the so-called configuration model.

no code implementations • 10 Jun 2021 • Firas Jarboui, Vianney Perchet

We introduce a new procedure to neuralize unsupervised Hidden Markov Models in the continuous case.

no code implementations • 9 Jun 2021 • Firas Jarboui, Vianney Perchet

Current solutions either solve a behaviour cloning problem (which does not leverage the exploratory data) or a reinforced imitation learning problem (using a fixed cost function that discriminates available exploratory trajectories from expert ones).

no code implementations • NeurIPS 2021 • Flore Sentenac, Etienne Boursier, Vianney Perchet

In the centralized case, the number of accumulated packets remains bounded (i.e., the system is \textit{stable}) as long as the ratio between service rates and arrival rates is larger than $1$.
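The stability phenomenon can be observed in a toy single-queue simulation (an illustrative sketch with made-up Bernoulli arrival and service rates, not the paper's multi-queue model):

```python
import random

def simulate_queue(arrival_rate, service_rate, horizon, seed=0):
    """Discrete-time queue with Bernoulli arrivals and Bernoulli services."""
    rng = random.Random(seed)
    queue, trajectory = 0, []
    for _ in range(horizon):
        if rng.random() < arrival_rate:                 # a packet arrives
            queue += 1
        if queue > 0 and rng.random() < service_rate:   # a packet is served
            queue -= 1
        trajectory.append(queue)
    return trajectory

# Service rate twice the arrival rate (ratio 2 > 1): the queue stays small.
trajectory = simulate_queue(arrival_rate=0.3, service_rate=0.6, horizon=10000)
```

With the ratio above $1$ the queue length fluctuates around a small value instead of growing linearly.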

no code implementations • 25 May 2021 • Firas Jarboui, Vianney Perchet

The global objective of inverse Reinforcement Learning (IRL) is to estimate the unknown cost function of some MDP based on observed trajectories generated by (approximately) optimal policies.

no code implementations • 17 Mar 2021 • Evrard Garcelon, Vianney Perchet, Matteo Pirotta

A critical aspect of bandit methods is that they require observing the contexts -- i.e., individual or group-level data -- and rewards in order to solve the sequential problem.

1 code implementation • NeurIPS 2021 • Etienne Boursier, Tristan Garrec, Vianney Perchet, Marco Scarsini

If she accepts the proposal, she is busy for the duration of the task and obtains a reward that depends on the task duration.

no code implementations • 4 Jan 2021 • Matthieu Jedor, Jonathan Louëdec, Vianney Perchet

On the other hand, this heuristic performs reasonably well in practice and even enjoys sublinear, near-optimal regret bounds in some very specific linear contextual and Bayesian bandit models.

no code implementations • 1 Jan 2021 • Firas Jarboui, Vianney Perchet

We consider the quickest change detection problem where the parameters of both the pre- and post-change distributions are unknown, which prevents the use of classical simple hypothesis testing.

no code implementations • 28 Dec 2020 • Matthieu Jedor, Jonathan Louëdec, Vianney Perchet

Continuously learning and leveraging the knowledge accumulated from prior tasks in order to improve future performance is a long-standing machine learning problem.

no code implementations • NeurIPS 2020 • Sandrine Peche, Vianney Perchet

We consider the stochastic block model where connection between vertices is perturbed by some latent (and unobserved) random geometric graph.

no code implementations • NeurIPS 2021 • Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke, Matteo Pirotta

Motivated by this, we study privacy in the context of finite-horizon Markov Decision Processes (MDPs) by requiring information to be obfuscated on the user side.

no code implementations • 20 Jul 2020 • Etienne Boursier, Vianney Perchet, Marco Scarsini

In the simple uni-dimensional and static setting, beliefs about the quality are known to converge to its true value.

no code implementations • NeurIPS 2020 • Pierre Perrault, Etienne Boursier, Vianney Perchet, Michal Valko

In CMAB, the question of the existence of an efficient policy with an optimal asymptotic regret (up to a factor poly-logarithmic with the action size) is still open for many families of distributions, including mutually independent outcomes, and more generally the multivariate sub-Gaussian family.

1 code implementation • NeurIPS 2019 • Matthieu Jedor, Jonathan Louedec, Vianney Perchet

We introduce a new stochastic multi-armed bandit setting where arms are grouped inside "ordered" categories.

no code implementations • 4 Feb 2020 • Etienne Boursier, Vianney Perchet

We provide the first algorithm robust to selfish players (a.k.a.

no code implementations • 25 Sep 2019 • Firas Jarboui, Vianney Perchet, Roman EGGER

Expanding Non-Markovian Reward Decision Processes (NMRDP) into Markov Decision Processes (MDP) enables the use of state-of-the-art Reinforcement Learning (RL) techniques to identify optimal policies.

no code implementations • 10 Jul 2019 • Firas Jarboui, Célya Gruson-daniel, Pierre Chanial, Alain Durmus, Vincent Rocchisani, Sophie-helene Goulet Ebongue, Anneliese Depoux, Wilfried Kirschenmann, Vianney Perchet

Studies on massive open online courses (MOOCs) users discuss the existence of typical profiles and their impact on the learning process of the students.

no code implementations • 20 Jun 2019 • Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

By minimizing the $\ell^2$-loss $\mathbb{E}[\lVert\hat{\beta}-\beta^{\star}\rVert^2]$, the decision maker is actually minimizing the trace of the covariance matrix of the problem, which then corresponds to online A-optimal design.
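The identity behind this snippet -- for ordinary least squares with homoscedastic noise of variance $\sigma^2$, $\mathbb{E}[\lVert\hat\beta-\beta^\star\rVert^2] = \sigma^2\,\mathrm{tr}((X^\top X)^{-1})$ -- can be checked numerically. A small sketch (the problem sizes and parameters below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, sigma = 200, 3, 0.5
X = rng.normal(size=(n, d))
beta_star = np.array([1.0, -2.0, 0.5])

# Theoretical value: E||beta_hat - beta_star||^2 = sigma^2 * tr((X^T X)^{-1}).
theory = sigma**2 * np.trace(np.linalg.inv(X.T @ X))

# Monte Carlo estimate over repeated noise draws with X fixed.
errs = []
for _ in range(2000):
    y = X @ beta_star + sigma * rng.normal(size=n)
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    errs.append(np.sum((beta_hat - beta_star) ** 2))
mc = np.mean(errs)

# mc and theory should agree up to Monte Carlo error.
```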

no code implementations • NeurIPS 2021 • Nicolò Cesa-Bianchi, Tommaso Cesari, Yishay Mansour, Vianney Perchet

We introduce a novel theoretical framework for Return On Investment (ROI) maximization in repeated decision-making.

1 code implementation • 27 May 2019 • Etienne Boursier, Vianney Perchet

Strategic information is valuable either by remaining private (for instance if it is sensitive) or, on the other hand, by being used publicly to increase some utility.

no code implementations • 12 Feb 2019 • Xavier Fontaine, Shie Mannor, Vianney Perchet

This can be recast as a specific stochastic optimization problem where the objective is to maximize the cumulative reward, or equivalently to minimize the regret.

no code implementations • 11 Feb 2019 • Pierre Perrault, Vianney Perchet, Michal Valko

We improve the efficiency of algorithms for stochastic \emph{combinatorial semi-bandits}.

no code implementations • 4 Feb 2019 • Etienne Boursier, Emilie Kaufmann, Abbas Mehrabian, Vianney Perchet

We study a multiplayer stochastic multi-armed bandit problem in which players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward.
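The collision rule is easy to state in code. A hypothetical one-round sketch with Bernoulli rewards (the arm means and player choices below are made up for the example):

```python
import random

def play_round(arm_choices, arm_means, rng):
    """One round of the no-communication multiplayer bandit.

    Players who collide (pull the same arm as another player) receive zero
    reward; a lone player on arm k receives a Bernoulli(arm_means[k]) draw.
    """
    counts = {}
    for arm in arm_choices:
        counts[arm] = counts.get(arm, 0) + 1
    rewards = []
    for arm in arm_choices:
        if counts[arm] > 1:       # collision: zero reward
            rewards.append(0)
        else:
            rewards.append(1 if rng.random() < arm_means[arm] else 0)
    return rewards

rng = random.Random(0)
# Players 1 and 2 collide on arm 0; player 3 plays arm 2 alone (mean 1.0).
rewards = play_round([0, 0, 2], [0.9, 0.5, 1.0], rng)
```

The algorithmic challenge is for players to spread over distinct good arms without exchanging messages.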

no code implementations • 11 Oct 2018 • Xavier Fontaine, Quentin Berthet, Vianney Perchet

We consider the stochastic contextual bandit problem with additional regularization.

no code implementations • 9 Oct 2018 • Rémy Degenne, Thomas Nedelec, Clément Calauzènes, Vianney Perchet

State of the art online learning procedures focus either on selecting the best alternative ("best arm identification") or on minimizing the cost (the "regret").

1 code implementation • NeurIPS 2019 • Etienne Boursier, Vianney Perchet

Motivated by cognitive radio networks, we consider the stochastic multiplayer multi-armed bandit problem, where several players pull arms simultaneously and collisions occur if one of them is pulled by several players at the same stage.

no code implementations • 10 Jul 2018 • Rémy Degenne, Evrard Garcelon, Vianney Perchet

We consider the classical stochastic multi-armed bandit but where, from time to time and roughly with frequency $\epsilon$, an extra observation is gathered by the agent for free.

no code implementations • 9 Jul 2018 • Nicolò Cesa-Bianchi, Tommaso Cesari, Vianney Perchet

When $K=2$ in the distribution-dependent case, the hardness of our setting reduces to that of a stochastic $2$-armed bandit: we prove that an upper bound of order $(\log T)/\Delta$ (up to $\log\log$ factors) on the regret can be achieved with no information on the demand curve.

no code implementations • 6 Jun 2018 • Pierre Perrault, Vianney Perchet, Michal Valko

We consider the problem where an agent wants to find a hidden object that is randomly located in some vertex of a directed acyclic graph (DAG) according to a fixed but possibly unknown distribution.

no code implementations • 28 Jun 2017 • Claire Vernade, Olivier Cappé, Vianney Perchet

We assume that the probability of conversion associated with each action is unknown while the distribution of the conversion delay is known, distinguishing between the (idealized) case where the conversion events may be observed whatever their delay and the more realistic setting in which late conversions are censored.

no code implementations • 5 Jun 2017 • Joon Kwon, Vianney Perchet, Claire Vernade

In the classical multi-armed bandit problem, $d$ arms are available to the decision maker who pulls them sequentially in order to maximize his cumulative reward.
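A standard index policy for this classical setting is UCB1: pull the arm with the highest optimistic (mean plus confidence width) index. An illustrative sketch with Bernoulli arms, not tied to this paper:

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """UCB1 on Bernoulli arms; returns how often each arm was pulled."""
    rng = random.Random(seed)
    d = len(arm_means)
    counts = [0] * d
    sums = [0.0] * d
    for t in range(1, horizon + 1):
        if t <= d:
            arm = t - 1   # initialization: pull each arm once
        else:
            # Optimistic index: empirical mean + exploration bonus.
            arm = max(range(d), key=lambda k: sums[k] / counts[k]
                      + math.sqrt(2 * math.log(t) / counts[k]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts

counts = ucb1([0.2, 0.8], horizon=2000)
# The better arm (mean 0.8) receives the vast majority of the pulls.
```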

no code implementations • 3 Apr 2017 • Thomas Nedelec, Nicolas Le Roux, Vianney Perchet

We provide a comparative study of several widely used off-policy estimators (Empirical Average, Basic Importance Sampling and Normalized Importance Sampling), detailing the different regimes where they are individually suboptimal.

no code implementations • NeurIPS 2017 • Quentin Berthet, Vianney Perchet

We consider the problem of bandit optimization, inspired by stochastic optimization and online learning problems with bandit feedback.

no code implementations • NeurIPS 2016 • Rémy Degenne, Vianney Perchet

We introduce a way to quantify the dependency structure of the problem and design an algorithm that adapts to it.

no code implementations • 28 Sep 2016 • János Flesch, Rida Laraki, Vianney Perchet

The third is necessary: if it is not satisfied, the opponent can weakly exclude the target set.

no code implementations • 26 May 2016 • Francis Bach, Vianney Perchet

The minimization of convex functions which are only available through partial and noisy information is a key methodological problem in many disciplines.

no code implementations • 26 Nov 2015 • Joon Kwon, Vianney Perchet

We demonstrate that, in the classical non-stochastic regret minimization problem with $d$ decisions, gains and losses to be respectively maximized or minimized are fundamentally different.

no code implementations • 18 Nov 2015 • Jonathan Weed, Vianney Perchet, Philippe Rigollet

To our knowledge, this is the first complete set of strategies for bidders participating in auctions of this type.

no code implementations • 10 Feb 2014 • Shie Mannor, Vianney Perchet, Gilles Stoltz

We show that it is impossible, in general, to approach the best target set in hindsight and propose achievable though ambitious alternative goals.

no code implementations • 19 Nov 2013 • Emile Contal, Vianney Perchet, Nicolas Vayatis

In this paper, we analyze a generic algorithm scheme for sequential global optimization using Gaussian processes.

no code implementations • 23 May 2013 • Shie Mannor, Vianney Perchet, Gilles Stoltz

In this paper we provide primal conditions on a convex set to be approachable with partial monitoring.

no code implementations • 27 Oct 2011 • Vianney Perchet, Philippe Rigollet

We consider a multi-armed bandit problem in a setting where each arm produces a noisy reward realization which depends on an observable random covariate.
