Search Results for author: Nicolas Gast

Found 10 papers, 3 papers with code

Model Predictive Control is Almost Optimal for Restless Bandit

no code implementations • 8 Oct 2024 • Nicolas Gast, Dheeraj Narasimha

Our solution requires minimal assumptions and quantifies the loss in optimality in terms of $\tau$ and the number of arms, $N$.

Model Predictive Control

Computing the Bias of Constant-step Stochastic Approximation with Markovian Noise

1 code implementation • 23 May 2024 • Sebastian Allmeier, Nicolas Gast

Furthermore, we show that the time-averaged bias is equal to $\alpha V + O(\alpha^2)$, where $V$ is a constant characterized by a Lyapunov equation, showing that $\mathbb{E}[\bar{\theta}_n] \approx \theta^*+V\alpha + O(\alpha^2)$, where $\bar{\theta}_n=(1/n)\sum_{k=1}^n\theta_k$ is the Polyak-Ruppert average.

Decentralized model-free reinforcement learning in stochastic games with average-reward objective

no code implementations • 13 Jan 2023 • Romain Cravic, Nicolas Gast, Bruno Gaujal

We propose the first model-free algorithm that achieves low regret performance for decentralized learning in two-player zero-sum tabular stochastic games with infinite-horizon average-reward objective.

Q-Learning reinforcement-learning +1

On Fair Selection in the Presence of Implicit and Differential Variance

no code implementations • 10 Dec 2021 • Vitalii Emelianov, Nicolas Gast, Krishna P. Gummadi, Patrick Loiseau

In the second setting (with known variances), imposing the $\gamma$-rule decreases the utility but we prove a bound on the utility loss due to the fairness mechanism.


Reinforcement Learning for Markovian Bandits: Is Posterior Sampling more Scalable than Optimism?

no code implementations • 16 Jun 2021 • Nicolas Gast, Bruno Gaujal, Kimang Khun

While the regret bound and runtime of vanilla implementations of PSRL and UCRL2 are exponential in the number of bandits, we show that the episodic regret of MB-PSRL and MB-UCRL2 is $\tilde{O}(S\sqrt{nK})$ where $K$ is the number of episodes, $n$ is the number of bandits and $S$ is the number of states of each bandit (the exact bound in $S$, $n$ and $K$ is given in the paper).

reinforcement-learning Reinforcement Learning (RL)

Exponential Convergence Rate for the Asymptotic Optimality of Whittle Index Policy

no code implementations • 16 Dec 2020 • Nicolas Gast, Bruno Gaujal, Chen Yan

In this paper we show that, under the same conditions, the convergence rate is exponential in the number of bandits, unless the fixed point is singular (to be defined later).

Performance Optimization and Control Probability

On Fair Selection in the Presence of Implicit Variance

no code implementations • 24 Jun 2020 • Vitalii Emelianov, Nicolas Gast, Krishna P. Gummadi, Patrick Loiseau

We then compare the utility obtained by imposing a fairness mechanism that we term $\gamma$-rule (it includes demographic parity and the four-fifths rule as special cases), to that of a group-oblivious selection algorithm that picks the candidates with the highest estimated quality independently of their group.


The Price of Local Fairness in Multistage Selection

1 code implementation • 15 Jun 2019 • Vitalii Emelianov, George Arvanitakis, Nicolas Gast, Krishna Gummadi, Patrick Loiseau

In particular, our experiments show that the price of local fairness is generally smaller when the sensitive attribute is observed at the first stage, but globally fair selections are more locally fair when the sensitive attribute is observed at the second stage. Hence, in both cases, it is often possible to have a selection that has a small price of local fairness and is close to locally fair.

Attribute Decision Making +1

Linear Regression from Strategic Data Sources

1 code implementation • 30 Sep 2013 • Nicolas Gast, Stratis Ioannidis, Patrick Loiseau, Benjamin Roussillon

In this paper, we study a setting in which features are public but individuals choose the precision of the outputs they reveal to an analyst.

