Search Results for author: Alexandre Proutiere

Found 52 papers, 7 papers with code

Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

no code implementations 24 Feb 2024 Yassir Jedra, William Réveillard, Stefan Stojanovic, Alexandre Proutiere

For policy evaluation and best policy identification, we show that our algorithms are nearly minimax optimal.

Multi-Armed Bandits

Best Arm Identification with Fixed Budget: A Large Deviation Perspective

1 code implementation NeurIPS 2023 Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere

In particular, we present CR (Continuous Rejects), a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of various arms.

Multi-Armed Bandits

Sub-linear Regret in Adaptive Model Predictive Control

no code implementations 7 Oct 2023 Damianos Tranos, Alexandre Proutiere

Our result relies on a recently proposed exponential decay of sensitivity property and, to the best of our knowledge, is the first of its kind in this setting.

Model Predictive Control

On Uniformly Optimal Algorithms for Best Arm Identification in Two-Armed Bandits with Fixed Budget

no code implementations 23 Aug 2023 Po-An Wang, Kaito Ariu, Alexandre Proutiere

We prove that there is no algorithm that (i) performs as well as the algorithm sampling each arm equally (this algorithm is referred to as the uniform sampling algorithm) on all instances, and that (ii) strictly outperforms this algorithm on at least one instance.

Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model

no code implementations 18 Jun 2023 Kaito Ariu, Alexandre Proutiere, Se-Young Yun

To this end, we revisit instance-specific lower bounds on the expected number of misclassified items satisfied by any clustering algorithm.

Clustering, Stochastic Block Model

On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure

1 code implementation 28 Nov 2022 Alessio Russo, Alexandre Proutiere

Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor).

Representation Learning

Self-Tuning Tube-based Model Predictive Control

no code implementations 2 Oct 2022 Damianos Tranos, Alessio Russo, Alexandre Proutiere

We present Self-Tuning Tube-based Model Predictive Control (STT-MPC), an adaptive robust control algorithm for uncertain linear systems with additive disturbances based on the least-squares estimator and polytopic tubes.

Model Predictive Control

Tube-Based Zonotopic Data-Driven Predictive Control

1 code implementation 7 Sep 2022 Alessio Russo, Alexandre Proutiere

We present a novel tube-based data-driven predictive control method for linear systems affected by a bounded additive disturbance.

Computational Efficiency

Best Policy Identification in Linear MDPs

no code implementations 11 Aug 2022 Jerome Taupin, Yassir Jedra, Alexandre Proutiere

We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model.

Measurement-based Admission Control in Sliced Networks: A Best Arm Identification Approach

no code implementations 14 Apr 2022 Simon Lindståhl, Alexandre Proutiere, Andreas Johnsson

The objective is to devise a joint measurement and decision strategy that returns a correct decision (e.g., the least loaded slice) with a certain level of confidence while minimizing the measurement cost (the number of measurements made before committing to the decision).

Learning Optimal Antenna Tilt Control Policies: A Contextual Linear Bandit Approach

no code implementations 6 Jan 2022 Filippo Vannella, Alexandre Proutiere, Yassir Jedra, Jaeseong Jeong

In this paper, we devise algorithms learning optimal tilt control policies from existing data (in the so-called passive learning setting) or from data actively generated by the algorithms (the active learning setting).

Active Learning

Fast Pure Exploration via Frank-Wolfe

no code implementations NeurIPS 2021 Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere

For this problem, instance-specific lower bounds on the expected sample complexity reveal the optimal proportions of arm draws an Oracle algorithm would apply.

Minimal Expected Regret in Linear Quadratic Control

no code implementations 29 Sep 2021 Yassir Jedra, Alexandre Proutiere

Quantifying the impact of such a constantly-varying control policy on the performance of these estimates and on the regret constitutes one of the technical challenges tackled in this paper.

Distributed Zeroth-Order Optimization: Convergence Rates That Match Centralized Counterpart

no code implementations 29 Sep 2021 Deming Yuan, Lei Wang, Alexandre Proutiere, Guodong Shi

Zeroth-order optimization has become increasingly important in complex optimization and machine learning when cost functions cannot be described in closed analytical form.

Balancing detectability and performance of attacks on the control channel of Markov Decision Processes

1 code implementation 15 Sep 2021 Alessio Russo, Alexandre Proutiere

In such an attack, drawing inspiration from adversarial examples used in supervised learning, the amplitude of the adversarial perturbation is limited according to some norm, with the hope that this constraint will make the attack imperceptible.

Reinforcement Learning (RL)

Online Learning of Optimally Diverse Rankings

no code implementations 13 Sep 2021 Stefan Magureanu, Alexandre Proutiere, Marcus Isaksson, Boxun Zhang

In the absence of any contextual information about the query, one often has to adhere to the diversity principle, i.e., to return a list covering the various possible topics or meanings of the query.

Learning-To-Rank

Regret Analysis in Deterministic Reinforcement Learning

no code implementations 27 Jun 2021 Damianos Tranos, Alexandre Proutiere

We consider Markov Decision Processes (MDPs) with deterministic transitions and study the problem of regret minimization, which is central to the analysis and design of optimal learning algorithms.

Reinforcement Learning (RL)

Navigating to the Best Policy in Markov Decision Processes

no code implementations NeurIPS 2021 Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere

We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.

Poisoning Attacks against Data-Driven Control Methods

no code implementations 10 Mar 2021 Alessio Russo, Alexandre Proutiere

This paper investigates poisoning attacks against data-driven control methods.

LEMMA

Data-Driven Control and Data-Poisoning attacks in Buildings: the KTH Live-In Lab case study

no code implementations 10 Mar 2021 Alessio Russo, Marco Molinari, Alexandre Proutiere

This work investigates the feasibility of using input-output data-driven control techniques for building control and their susceptibility to data-poisoning techniques.

Data Poisoning

Minimizing Information Leakage of Abrupt Changes in Stochastic Systems

1 code implementation 2 Mar 2021 Alessio Russo, Alexandre Proutiere

In contrast to previous work on privacy, we study the problem for an online sequence of data.

Adaptive Sampling for Best Policy Identification in Markov Decision Processes

no code implementations 28 Sep 2020 Aymen Al Marjani, Alexandre Proutiere

We then provide a simple and tight upper bound on the sample complexity lower bound, whose corresponding nearly-optimal sample allocation becomes explicit.

Optimal Best-arm Identification in Linear Bandits

no code implementations NeurIPS 2020 Yassir Jedra, Alexandre Proutiere

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits.

Off-policy Learning for Remote Electrical Tilt Optimization

no code implementations 21 May 2020 Filippo Vannella, Jaeseong Jeong, Alexandre Proutiere

In this paper, we circumvent these issues by learning an improved policy in an offline manner using existing data collected on real networks.

Predictive Bandits

no code implementations 2 Apr 2020 Simon Lindståhl, Alexandre Proutiere, Andreas Johnsson

In each round, the decision maker first decides whether to gather information about the rewards of particular arms (so that their rewards in this round can be predicted).

Finite-time Identification of Stable Linear Systems: Optimality of the Least-Squares Estimator

no code implementations 17 Mar 2020 Yassir Jedra, Alexandre Proutiere

We present a new finite-time analysis of the estimation error of the Ordinary Least Squares (OLS) estimator for stable linear time-invariant systems.

Distributed Online Optimization with Long-Term Constraints

no code implementations 20 Dec 2019 Deming Yuan, Alexandre Proutiere, Guodong Shi

When the loss functions are strongly convex, we establish improved regret and constraint violation upper bounds of $\mathcal{O}(\log(T))$ and $\mathcal{O}(\sqrt{T\log(T)})$, respectively.

An Optimal Algorithm for Multiplayer Multi-Armed Bandits

no code implementations 28 Sep 2019 Alexandre Proutiere, Po-An Wang

We present DPE (Decentralized Parsimonious Exploration), a decentralized algorithm that achieves the same regret as that obtained by an optimal centralized algorithm.

Multi-Armed Bandits

Optimal Attacks on Reinforcement Learning Policies

no code implementations 31 Jul 2019 Alessio Russo, Alexandre Proutiere

Finally, we show that from the main agent's perspective, the system uncertainties and the attacker can be modeled as a Partially Observable Markov Decision Process.

Reinforcement Learning (RL)

From self-tuning regulators to reinforcement learning and back again

no code implementations 27 Jun 2019 Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu

Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world.

Reinforcement Learning (RL)

Sample Complexity Lower Bounds for Linear System Identification

no code implementations 25 Mar 2019 Yassir Jedra, Alexandre Proutiere

For controlled systems, our lower bounds are not as explicit as in the case of uncontrolled systems, but could well provide interesting insights into the design of control policies with minimal sample complexity.

valid

Distributed Online Linear Regression

no code implementations 13 Feb 2019 Deming Yuan, Alexandre Proutiere, Guodong Shi

We propose simple and natural distributed regression algorithms, involving, at each node and in each round, a local gradient descent step and a communication and averaging step where nodes aim at aligning their predictors to those of their neighbors.

regression

Learning to Personalize in Appearance-Based Gaze Tracking

no code implementations 2 Jul 2018 Erik Lindén, Jonas Sjöstrand, Alexandre Proutiere

We contribute to gaze tracking research by empirically showing that personal variations are well-modeled as a 3-dimensional latent parameter space for each eye.

Gaze Estimation

Exploration in Structured Reinforcement Learning

no code implementations NeurIPS 2018 Jungseul Ok, Alexandre Proutiere, Damianos Tranos

For Lipschitz MDPs, the bounds are shown not to scale with the sizes $S$ and $A$ of the state and action spaces, i.e., they are smaller than $c\log T$ where $T$ is the time horizon and the constant $c$ only depends on the Lipschitz structure, the span of the bias function, and the minimal action sub-optimality gap.

Reinforcement Learning (RL)

Minimal Exploration in Structured Stochastic Bandits

no code implementations NeurIPS 2017 Richard Combes, Stefan Magureanu, Alexandre Proutiere

This paper introduces and addresses a wide class of stochastic bandit problems where the function mapping the arm to the corresponding reward exhibits some known structural properties.

Thompson Sampling

Fast and Memory Optimal Low-Rank Matrix Approximation

no code implementations NeurIPS 2015 Se-Young Yun, Marc Lelarge, Alexandre Proutiere

This means that its average mean-square error converges to 0 as $m$ and $n$ grow large (i.e., $\|\hat{M}^{(k)}-M^{(k)} \|_F^2 = o(mn)$ with high probability, where $\hat{M}^{(k)}$ and $M^{(k)}$ denote the output of SLA and the optimal rank $k$ approximation of $M$, respectively).

Optimal Cluster Recovery in the Labeled Stochastic Block Model

no code implementations NeurIPS 2016 Se-Young Yun, Alexandre Proutiere

We find the set of parameters such that there exists a clustering algorithm with at most $s$ misclassified items on average under the general LSBM and for any $s=o(n)$, which solves one open problem raised in \cite{abbe2015community}.

Clustering, Community Detection

Cluster-Aided Mobility Predictions

no code implementations 12 Jul 2015 Jaeseong Jeong, Mathieu Leconte, Alexandre Proutiere

In this paper, we develop cluster-aided predictors that exploit past trajectories collected from all users to predict the next location of a given user.

Clustering

Streaming, Memory Limited Matrix Completion with Noise

no code implementations 13 Apr 2015 Se-Young Yun, Marc Lelarge, Alexandre Proutiere

We propose a streaming algorithm that produces an estimate of the original matrix with a vanishing mean square error, uses memory space scaling linearly with the ambient dimension of the matrix (i.e., the memory required to store the output alone), and performs a number of computations on the order of the number of non-zero entries of the input matrix.

Matrix Completion

Combinatorial Bandits Revisited

1 code implementation NeurIPS 2015 Richard Combes, M. Sadegh Talebi, Alexandre Proutiere, Marc Lelarge

In the adversarial setting under bandit feedback, we propose CombEXP, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.

Accurate Community Detection in the Stochastic Block Model via Spectral Algorithms

no code implementations 23 Dec 2014 Se-Young Yun, Alexandre Proutiere

We consider the problem of community detection in the Stochastic Block Model with a finite number $K$ of communities of sizes linearly growing with the network size $n$.

Social and Information Networks, Data Structures and Algorithms

Streaming, Memory Limited Algorithms for Community Detection

no code implementations NeurIPS 2014 Se-Young Yun, Marc Lelarge, Alexandre Proutiere

The first algorithm is offline, as it needs to store and keep the assignments of nodes to clusters, and requires a memory that scales linearly with the network size.

Clustering, Community Detection

Unimodal Bandits without Smoothness

no code implementations 28 Jun 2014 Richard Combes, Alexandre Proutiere

To our knowledge, the SP algorithm constitutes the first sequential arm selection rule that achieves a regret and optimization error scaling as $O(\sqrt{T})$ and $O(1/\sqrt{T})$, respectively, up to a logarithmic factor for non-smooth expected reward functions, as well as for smooth functions with unknown smoothness.

Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms

no code implementations 20 May 2014 Richard Combes, Alexandre Proutiere

We also provide a regret upper bound for OSUB in non-stationary environments where the expected rewards smoothly evolve over time.

Multi-Armed Bandits

Lipschitz Bandits: Regret Lower Bounds and Optimal Algorithms

no code implementations 19 May 2014 Stefan Magureanu, Richard Combes, Alexandre Proutiere

For discrete Lipschitz bandits, we derive asymptotic problem-specific lower bounds for the regret satisfied by any algorithm, and propose OSLB and CKL-UCB, two algorithms that efficiently exploit the Lipschitz structure of the problem.

Multi-Armed Bandits

Dynamic Rate and Channel Selection in Cognitive Radio Systems

no code implementations 23 Feb 2014 Richard Combes, Alexandre Proutiere

In turn, the proposed algorithms optimally exploit the inherent structure of the throughput.

Two-Target Algorithms for Infinite-Armed Bandits with Bernoulli Rewards

no code implementations NeurIPS 2013 Thomas Bonald, Alexandre Proutiere

This two-target algorithm achieves a long-term average regret in $\sqrt{2n}$ for a large parameter $m$ and a known time horizon $n$.

Vocal Bursts Valence Prediction

Stochastic Online Shortest Path Routing: The Value of Feedback

no code implementations 27 Sep 2013 M. Sadegh Talebi, Zhenhua Zou, Richard Combes, Alexandre Proutiere, Mikael Johansson

The parameters, and hence the optimal path, can only be estimated by routing packets through the network and observing the realized delays.

Spectrum Bandit Optimization

no code implementations 27 Feb 2013 Marc Lelarge, Alexandre Proutiere, M. Sadegh Talebi

We consider the problem of allocating radio channels to links in a wireless network.
