no code implementations • 30 Oct 2024 • Stefan Stojanovic, Yassir Jedra, Alexandre Proutiere
The algorithm exploits these new samples to complete the matrix estimation using a CUR-like method.
no code implementations • 21 Jul 2024 • Frédéric Zheng, Alexandre Proutiere
This gap strongly depends on the mixing properties of the underlying Markov chain, and we prove that it typically scales as $\sqrt{t_\mathrm{mix}\ln(n)/n}$ (where $t_\mathrm{mix}$ is the mixing time of the chain).
1 code implementation • NeurIPS 2023 • Alessio Russo, Alexandre Proutiere
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution.
1 code implementation • 24 Feb 2024 • Yassir Jedra, William Réveillard, Stefan Stojanovic, Alexandre Proutiere
For policy evaluation and best policy identification, we show that our algorithms are nearly minimax optimal.
1 code implementation • NeurIPS 2023 • Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere
In particular, we present CR (Continuous Rejects), a truly adaptive algorithm that can reject arms in {\it any} round based on the observed empirical gaps between the rewards of various arms.
no code implementations • 7 Oct 2023 • Damianos Tranos, Alexandre Proutiere
Our result relies on a recently proposed exponential decay of sensitivity property and, to the best of our knowledge, is the first of its kind in this setting.
no code implementations • 23 Aug 2023 • Po-An Wang, Kaito Ariu, Alexandre Proutiere
For the problem with two arms, also known as the A/B testing problem, we prove that there is no algorithm that (i) performs as well as the algorithm sampling each arm equally (referred to as the {\it uniform sampling} algorithm) in all instances, and that (ii) strictly outperforms uniform sampling on at least one instance.
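To make the baseline concrete, here is a minimal sketch of uniform sampling for a two-armed problem. It assumes a fixed sampling budget and Bernoulli rewards; the function name `uniform_sampling`, the arm means, and the budget are hypothetical and chosen only for illustration, not taken from the paper.

```python
import random

def uniform_sampling(arm_means, budget, seed=0):
    """Pull each arm equally often and return the arm with the highest empirical mean."""
    rng = random.Random(seed)
    pulls_per_arm = budget // len(arm_means)
    empirical_means = []
    for mean in arm_means:
        # Assumed reward model: Bernoulli rewards with the given mean.
        rewards = [1.0 if rng.random() < mean else 0.0 for _ in range(pulls_per_arm)]
        empirical_means.append(sum(rewards) / pulls_per_arm)
    best = max(range(len(arm_means)), key=lambda a: empirical_means[a])
    return best, empirical_means

# Hypothetical A/B test: arm 1 has the higher mean reward.
best_arm, estimates = uniform_sampling([0.3, 0.7], budget=2000)
```

The allocation here never adapts to the observed rewards, which is the property the excerpt compares adaptive algorithms against.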
no code implementations • 18 Jun 2023 • Kaito Ariu, Alexandre Proutiere, Se-Young Yun
In this paper, we investigate the problem of recovering hidden communities in the Labeled Stochastic Block Model (LSBM) with a finite number of clusters whose sizes grow linearly with the total number of nodes.
no code implementations • 5 Apr 2023 • Daniele Foffano, Alessio Russo, Alexandre Proutiere
Reinforcement Learning aims at identifying and evaluating efficient control policies from data.
1 code implementation • 28 Nov 2022 • Alessio Russo, Alexandre Proutiere
Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor).
no code implementations • 2 Oct 2022 • Damianos Tranos, Alessio Russo, Alexandre Proutiere
We present Self-Tuning Tube-based Model Predictive Control (STT-MPC), an adaptive robust control algorithm for uncertain linear systems with additive disturbances based on the least-squares estimator and polytopic tubes.
1 code implementation • 7 Sep 2022 • Alessio Russo, Alexandre Proutiere
We present a novel tube-based data-driven predictive control method for linear systems affected by a bounded additive disturbance.
no code implementations • 11 Aug 2022 • Jerome Taupin, Yassir Jedra, Alexandre Proutiere
We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model.
no code implementations • 14 Apr 2022 • Simon Lindståhl, Alexandre Proutiere, Andreas Johnsson
The objective is to devise a joint measurement and decision strategy that returns a correct decision (e.g., the least loaded slice) with a certain level of confidence while minimizing the measurement cost (the number of measurements made before committing to the decision).
no code implementations • 6 Jan 2022 • Filippo Vannella, Alexandre Proutiere, Yassir Jedra, Jaeseong Jeong
In this paper, we devise algorithms learning optimal tilt control policies from existing data (in the so-called passive learning setting) or from data actively generated by the algorithms (the active learning setting).
no code implementations • NeurIPS 2021 • Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere
For this problem, instance-specific lower bounds on the expected sample complexity reveal the optimal proportions of arm draws an Oracle algorithm would apply.
no code implementations • 29 Sep 2021 • Deming Yuan, Lei Wang, Alexandre Proutiere, Guodong Shi
Zeroth-order optimization has become increasingly important in complex optimization and machine learning when cost functions cannot be described in closed analytical form.
no code implementations • 29 Sep 2021 • Yassir Jedra, Alexandre Proutiere
Quantifying the impact of such a constantly-varying control policy on the performance of these estimates and on the regret constitutes one of the technical challenges tackled in this paper.
1 code implementation • 15 Sep 2021 • Alessio Russo, Alexandre Proutiere
In such an attack, drawing inspiration from adversarial examples used in supervised learning, the amplitude of the adversarial perturbation is limited according to some norm, with the hope that this constraint will make the attack imperceptible.
no code implementations • 13 Sep 2021 • Stefan Magureanu, Alexandre Proutiere, Marcus Isaksson, Boxun Zhang
In the absence of any contextual information about the query, one often has to adhere to the {\it diversity} principle, i.e., to return a list covering the various possible topics or meanings of the query.
no code implementations • 27 Jun 2021 • Damianos Tranos, Alexandre Proutiere
We consider Markov Decision Processes (MDPs) with deterministic transitions and study the problem of regret minimization, which is central to the analysis and design of optimal learning algorithms.
no code implementations • NeurIPS 2021 • Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere
We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.
no code implementations • 10 Mar 2021 • Alessio Russo, Marco Molinari, Alexandre Proutiere
This work investigates the feasibility of using input-output data-driven control techniques for building control and their susceptibility to data-poisoning techniques.
no code implementations • 10 Mar 2021 • Alessio Russo, Alexandre Proutiere
This paper investigates poisoning attacks against data-driven control methods.
1 code implementation • 2 Mar 2021 • Alessio Russo, Alexandre Proutiere
In contrast to previous work on privacy, we study the problem for an online sequence of data.
no code implementations • 28 Sep 2020 • Aymen Al Marjani, Alexandre Proutiere
We then provide a simple and tight upper bound on the sample complexity lower bound, whose corresponding nearly-optimal sample allocation becomes explicit.
no code implementations • NeurIPS 2020 • Yassir Jedra, Alexandre Proutiere
We study the problem of best-arm identification with fixed confidence in stochastic linear bandits.
no code implementations • 21 May 2020 • Filippo Vannella, Jaeseong Jeong, Alexandre Proutiere
In this paper, we circumvent these issues by learning an improved policy in an offline manner using existing data collected on real networks.
no code implementations • 2 Apr 2020 • Simon Lindståhl, Alexandre Proutiere, Andreas Johnsson
In each round, the decision maker first decides whether to gather information about the rewards of particular arms (so that their rewards in this round can be predicted).
no code implementations • 17 Mar 2020 • Yassir Jedra, Alexandre Proutiere
We present a new finite-time analysis of the estimation error of the Ordinary Least Squares (OLS) estimator for stable linear time-invariant systems.
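As a rough illustration of the estimator discussed in this entry, the sketch below simulates a scalar stable system $x_{t+1} = a x_t + w_t$ and computes the OLS estimate of $a$ from a single trajectory. The scalar setting, the Gaussian noise, the function names `simulate_trajectory` and `ols_estimate`, and all parameter values are assumptions made for illustration; they are not taken from the paper.

```python
import random

def simulate_trajectory(a, horizon, noise_std=1.0, seed=0):
    """Simulate x_{t+1} = a * x_t + w_t with i.i.d. Gaussian noise and x_0 = 0."""
    rng = random.Random(seed)
    xs = [0.0]
    for _ in range(horizon):
        xs.append(a * xs[-1] + rng.gauss(0.0, noise_std))
    return xs

def ols_estimate(xs):
    """Least-squares estimate of a: minimizes sum_t (x_{t+1} - a * x_t)^2."""
    num = sum(x * y for x, y in zip(xs[:-1], xs[1:]))
    den = sum(x * x for x in xs[:-1])
    return num / den

# Hypothetical stable system with |a| < 1, observed over a single trajectory.
trajectory = simulate_trajectory(a=0.5, horizon=5000)
a_hat = ols_estimate(trajectory)
```

With a longer trajectory the estimate should concentrate more tightly around the true parameter, which is the kind of behavior a finite-time analysis quantifies.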
no code implementations • 20 Dec 2019 • Deming Yuan, Alexandre Proutiere, Guodong Shi
When the loss functions are strongly convex, we establish improved regret and constraint violation upper bounds in $\mathcal{O}(\log(T))$ and $\mathcal{O}(\sqrt{T\log(T)})$, respectively.
1 code implementation • NeurIPS 2019 • Se-Young Yun, Alexandre Proutiere
We derive information-theoretical upper bounds on the cluster recovery rate.
no code implementations • 14 Oct 2019 • Kaito Ariu, Jungseul Ok, Alexandre Proutiere, Se-Young Yun
The objective is to devise an algorithm with a minimal cluster recovery error rate.
no code implementations • 28 Sep 2019 • Alexandre Proutiere, Po-An Wang
We present DPE (Decentralized Parsimonious Exploration), a decentralized algorithm that achieves the same regret as that obtained by an optimal centralized algorithm.
no code implementations • 31 Jul 2019 • Alessio Russo, Alexandre Proutiere
Finally, we show that from the main agent perspective, the system uncertainties and the attacker can be modeled as a Partially Observable Markov Decision Process.
no code implementations • 27 Jun 2019 • Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu
Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world.
no code implementations • 25 Mar 2019 • Yassir Jedra, Alexandre Proutiere
For controlled systems, our lower bounds are not as explicit as in the case of uncontrolled systems, but could well provide interesting insights into the design of control policies with minimal sample complexity.
no code implementations • 13 Feb 2019 • Deming Yuan, Alexandre Proutiere, Guodong Shi
We propose simple and natural distributed regression algorithms, involving, at each node and in each round, a local gradient descent step and a communication and averaging step where nodes aim at aligning their predictors to those of their neighbors.
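The two-step round described in this entry (a local gradient step followed by averaging with neighbors) can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' algorithm: it uses a scalar linear predictor, a squared loss, a fixed step size, uniform averaging weights, and a hypothetical 4-node ring with noise-free data.

```python
def distributed_regression(local_data, neighbors, step_size=0.05, rounds=500):
    """Each round: a local gradient step at every node, then averaging with neighbors."""
    thetas = [0.0] * len(local_data)
    for _ in range(rounds):
        # Local gradient descent step on each node's own squared loss.
        stepped = []
        for theta, samples in zip(thetas, local_data):
            grad = sum((theta * x - y) * x for x, y in samples) / len(samples)
            stepped.append(theta - step_size * grad)
        # Communication and averaging step: align with the neighbors' predictors.
        thetas = [
            (stepped[i] + sum(stepped[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
            for i in range(len(stepped))
        ]
    return thetas

# Hypothetical setup: 4 nodes on a ring, noise-free data generated by y = 2 * x.
local_data = [[(float(i + 1), 2.0 * (i + 1))] for i in range(4)]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
predictors = distributed_regression(local_data, neighbors)
```

In this toy setup every node's local optimum coincides with the global one, so all predictors converge to the same value; with heterogeneous or noisy data, the averaging step is what pulls the local predictors toward agreement.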
no code implementations • 2 Jul 2018 • Erik Lindén, Jonas Sjöstrand, Alexandre Proutiere
We contribute to gaze tracking research by empirically showing that personal variations are well-modeled as a 3-dimensional latent parameter space for each eye.
no code implementations • NeurIPS 2018 • Jungseul Ok, Alexandre Proutiere, Damianos Tranos
For Lipschitz MDPs, the bounds are shown not to scale with the sizes $S$ and $A$ of the state and action spaces, i.e., they are smaller than $c\log T$ where $T$ is the time horizon and the constant $c$ only depends on the Lipschitz structure, the span of the bias function, and the minimal action sub-optimality gap.
no code implementations • NeurIPS 2017 • Richard Combes, Stefan Magureanu, Alexandre Proutiere
This paper introduces and addresses a wide class of stochastic bandit problems where the function mapping the arm to the corresponding reward exhibits some known structural properties.
no code implementations • NeurIPS 2015 • Se-Young Yun, Marc Lelarge, Alexandre Proutiere
This means that its average mean-square error converges to 0 as $m$ and $n$ grow large (i.e., $\|\hat{M}^{(k)}-M^{(k)} \|_F^2 = o(mn)$ with high probability, where $\hat{M}^{(k)}$ and $M^{(k)}$ denote the output of SLA and the optimal rank $k$ approximation of $M$, respectively).
no code implementations • NeurIPS 2016 • Se-Young Yun, Alexandre Proutiere
We find the set of parameters such that there exists a clustering algorithm with at most $s$ misclassified items on average under the general LSBM and for any $s=o(n)$, which solves one open problem raised in \cite{abbe2015community}.
no code implementations • 12 Jul 2015 • Jaeseong Jeong, Mathieu Leconte, Alexandre Proutiere
In this paper, we develop cluster-aided predictors that exploit past trajectories collected from all users to predict the next location of a given user.
no code implementations • 13 Apr 2015 • Se-Young Yun, Marc Lelarge, Alexandre Proutiere
We propose a streaming algorithm that produces an estimate of the original matrix with a vanishing mean square error, uses memory space scaling linearly with the ambient dimension of the matrix (i.e., the memory required to store the output alone), and performs a number of computations on the order of the number of non-zero entries of the input matrix.
1 code implementation • NeurIPS 2015 • Richard Combes, M. Sadegh Talebi, Alexandre Proutiere, Marc Lelarge
In the adversarial setting under bandit feedback, we propose \textsc{CombEXP}, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.
no code implementations • 23 Dec 2014 • Se-Young Yun, Alexandre Proutiere
We consider the problem of community detection in the Stochastic Block Model with a finite number $K$ of communities of sizes linearly growing with the network size $n$.
no code implementations • NeurIPS 2014 • Se-Young Yun, Marc Lelarge, Alexandre Proutiere
The first algorithm is {\it offline}, as it needs to store and keep the assignments of nodes to clusters, and requires a memory that scales linearly with the network size.
no code implementations • 28 Jun 2014 • Richard Combes, Alexandre Proutiere
To our knowledge, the SP algorithm constitutes the first sequential arm selection rule that achieves a regret and optimization error scaling as $O(\sqrt{T})$ and $O(1/\sqrt{T})$, respectively, up to a logarithmic factor for non-smooth expected reward functions, as well as for smooth functions with unknown smoothness.
no code implementations • 20 May 2014 • Richard Combes, Alexandre Proutiere
We also provide a regret upper bound for OSUB in non-stationary environments where the expected rewards smoothly evolve over time.
no code implementations • 19 May 2014 • Stefan Magureanu, Richard Combes, Alexandre Proutiere
For discrete Lipschitz bandits, we derive asymptotic problem specific lower bounds for the regret satisfied by any algorithm, and propose OSLB and CKL-UCB, two algorithms that efficiently exploit the Lipschitz structure of the problem.
no code implementations • 23 Feb 2014 • Richard Combes, Alexandre Proutiere
In turn, the proposed algorithms optimally exploit the inherent structure of the throughput.
no code implementations • NeurIPS 2013 • Thomas Bonald, Alexandre Proutiere
This two-target algorithm achieves a long-term average regret in $\sqrt{2n}$ for a large parameter $m$ and a known time horizon $n$.
no code implementations • 27 Sep 2013 • M. Sadegh Talebi, Zhenhua Zou, Richard Combes, Alexandre Proutiere, Mikael Johansson
The parameters, and hence the optimal path, can only be estimated by routing packets through the network and observing the realized delays.
no code implementations • 27 Feb 2013 • Marc Lelarge, Alexandre Proutiere, M. Sadegh Talebi
We consider the problem of allocating radio channels to links in a wireless network.