Search Results for author: Alexandre Proutiere

Found 52 papers, 7 papers with code

Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

no code implementations • 24 Feb 2024 • Yassir Jedra, William Réveillard, Stefan Stojanovic, Alexandre Proutiere

For policy evaluation and best policy identification, we show that our algorithms are nearly minimax optimal.


Best Arm Identification with Fixed Budget: A Large Deviation Perspective

1 code implementation • NeurIPS 2023 • Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere

In particular, we present Continuous Rejects, a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of the arms.

Multi-Armed Bandits


Sub-linear Regret in Adaptive Model Predictive Control

no code implementations • 7 Oct 2023 • Damianos Tranos, Alexandre Proutiere

Our result relies on a recently proposed exponential decay of sensitivity property and, to the best of our knowledge, is the first of its kind in this setting.

Model Predictive Control


On Uniformly Optimal Algorithms for Best Arm Identification in Two-Armed Bandits with Fixed Budget

no code implementations • 23 Aug 2023 • Po-An Wang, Kaito Ariu, Alexandre Proutiere

We prove that there is no algorithm that (i) performs as well as the algorithm sampling each arm equally (referred to as the uniform sampling algorithm) on all instances, and that (ii) strictly outperforms this algorithm on at least one instance.
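
As a rough illustration of the uniform sampling baseline mentioned in this abstract, the following sketch splits a fixed budget equally between two arms and returns the arm with the higher empirical mean. The function name, the `pull` callback, and the tie-breaking rule are assumptions made for this example and are not taken from the paper.

```python
def uniform_sampling_best_arm(pull, budget):
    """Split the budget equally between two arms; return the empirically better arm.

    `pull(arm)` is a hypothetical callback returning one reward for arm 0 or 1.
    """
    if budget < 2:
        raise ValueError("budget must allow at least one pull per arm")
    pulls_per_arm = budget // 2
    means = []
    for arm in (0, 1):
        rewards = [pull(arm) for _ in range(pulls_per_arm)]
        means.append(sum(rewards) / pulls_per_arm)
    # Ties are broken in favour of arm 0 (an arbitrary choice for this sketch).
    return 0 if means[0] >= means[1] else 1
```

Passing the reward source as a callback keeps the sketch independent of any particular reward distribution.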


Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model

no code implementations • 18 Jun 2023 • Kaito Ariu, Alexandre Proutiere, Se-Young Yun

To this end, we revisit instance-specific lower bounds on the expected number of misclassified items satisfied by any clustering algorithm.

Clustering, Stochastic Block Model


Conformal Off-Policy Evaluation in Markov Decision Processes

no code implementations • 5 Apr 2023 • Daniele Foffano, Alessio Russo, Alexandre Proutiere

Reinforcement Learning aims at identifying and evaluating efficient control policies from data.

Conformal Prediction, Off-policy Evaluation


On the Sample Complexity of Representation Learning in Multi-task Bandits with Global and Local structure

1 code implementation • 28 Nov 2022 • Alessio Russo, Alexandre Proutiere

Arms consist of two components: one that is shared across tasks (that we call representation) and one that is task-specific (that we call predictor).

Representation Learning


Self-Tuning Tube-based Model Predictive Control

no code implementations • 2 Oct 2022 • Damianos Tranos, Alessio Russo, Alexandre Proutiere

We present Self-Tuning Tube-based Model Predictive Control (STT-MPC), an adaptive robust control algorithm for uncertain linear systems with additive disturbances based on the least-squares estimator and polytopic tubes.

Model Predictive Control


Tube-Based Zonotopic Data-Driven Predictive Control

1 code implementation • 7 Sep 2022 • Alessio Russo, Alexandre Proutiere

We present a novel tube-based data-driven predictive control method for linear systems affected by a bounded additive disturbance.

Computational Efficiency


Best Policy Identification in Linear MDPs

no code implementations • 11 Aug 2022 • Jerome Taupin, Yassir Jedra, Alexandre Proutiere

We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed confidence setting under a generative model.


Measurement-based Admission Control in Sliced Networks: A Best Arm Identification Approach

no code implementations • 14 Apr 2022 • Simon Lindståhl, Alexandre Proutiere, Andreas Johnsson

The objective is to devise a joint measurement and decision strategy that returns a correct decision (e.g., the least loaded slice) with a certain level of confidence while minimizing the measurement cost (the number of measurements made before committing to the decision).


Learning Optimal Antenna Tilt Control Policies: A Contextual Linear Bandit Approach

no code implementations • 6 Jan 2022 • Filippo Vannella, Alexandre Proutiere, Yassir Jedra, Jaeseong Jeong

In this paper, we devise algorithms that learn optimal tilt control policies from existing data (the so-called passive learning setting) or from data actively generated by the algorithms (the active learning setting).

Active Learning


Fast Pure Exploration via Frank-Wolfe

no code implementations • NeurIPS 2021 • Po-An Wang, Ruo-Chun Tzeng, Alexandre Proutiere

For this problem, instance-specific lower bounds on the expected sample complexity reveal the optimal proportions of arm draws an Oracle algorithm would apply.


Minimal Expected Regret in Linear Quadratic Control

no code implementations • 29 Sep 2021 • Yassir Jedra, Alexandre Proutiere

Quantifying the impact of such a constantly-varying control policy on the performance of these estimates and on the regret constitutes one of the technical challenges tackled in this paper.


Distributed Zeroth-Order Optimization: Convergence Rates That Match Centralized Counterpart

no code implementations • 29 Sep 2021 • Deming Yuan, Lei Wang, Alexandre Proutiere, Guodong Shi

Zeroth-order optimization has become increasingly important in complex optimization and machine learning when cost functions cannot be described in closed analytical form.


Balancing detectability and performance of attacks on the control channel of Markov Decision Processes

1 code implementation • 15 Sep 2021 • Alessio Russo, Alexandre Proutiere

In such an attack, drawing inspiration from adversarial examples used in supervised learning, the amplitude of the adversarial perturbation is limited according to some norm, with the hope that this constraint will make the attack imperceptible.

Reinforcement Learning (RL)


Online Learning of Optimally Diverse Rankings

no code implementations • 13 Sep 2021 • Stefan Magureanu, Alexandre Proutiere, Marcus Isaksson, Boxun Zhang

In the absence of any contextual information about the query, one often has to adhere to the diversity principle, i.e., to return a list covering the various possible topics or meanings of the query.

Learning-To-Rank


Regret Analysis in Deterministic Reinforcement Learning

no code implementations • 27 Jun 2021 • Damianos Tranos, Alexandre Proutiere

We consider Markov Decision Processes (MDPs) with deterministic transitions and study the problem of regret minimization, which is central to the analysis and design of optimal learning algorithms.

Reinforcement Learning (RL)


Navigating to the Best Policy in Markov Decision Processes

no code implementations • NeurIPS 2021 • Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere

We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.


Poisoning Attacks against Data-Driven Control Methods

no code implementations • 10 Mar 2021 • Alessio Russo, Alexandre Proutiere

This paper investigates poisoning attacks against data-driven control methods.



Data-Driven Control and Data-Poisoning attacks in Buildings: the KTH Live-In Lab case study

no code implementations • 10 Mar 2021 • Alessio Russo, Marco Molinari, Alexandre Proutiere

This work investigates the feasibility of using input-output data-driven control techniques for building control and their susceptibility to data-poisoning techniques.

Data Poisoning


Minimizing Information Leakage of Abrupt Changes in Stochastic Systems

1 code implementation • 2 Mar 2021 • Alessio Russo, Alexandre Proutiere

In contrast to previous work on privacy, we study the problem for an online sequence of data.


Adaptive Sampling for Best Policy Identification in Markov Decision Processes

no code implementations • 28 Sep 2020 • Aymen Al Marjani, Alexandre Proutiere

We then provide a simple and tight upper bound of the sample complexity lower bound, whose corresponding nearly-optimal sample allocation becomes explicit.


Optimal Best-arm Identification in Linear Bandits

no code implementations • NeurIPS 2020 • Yassir Jedra, Alexandre Proutiere

We study the problem of best-arm identification with fixed confidence in stochastic linear bandits.


Off-policy Learning for Remote Electrical Tilt Optimization

no code implementations • 21 May 2020 • Filippo Vannella, Jaeseong Jeong, Alexandre Proutiere

In this paper, we circumvent these issues by learning an improved policy in an offline manner using existing data collected on real networks.


Predictive Bandits

no code implementations • 2 Apr 2020 • Simon Lindståhl, Alexandre Proutiere, Andreas Johnsson

In each round, the decision maker first decides whether to gather information about the rewards of particular arms (so that their rewards in this round can be predicted).


Finite-time Identification of Stable Linear Systems: Optimality of the Least-Squares Estimator

no code implementations • 17 Mar 2020 • Yassir Jedra, Alexandre Proutiere

We present a new finite-time analysis of the estimation error of the Ordinary Least Squares (OLS) estimator for stable linear time-invariant systems.
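
For readers unfamiliar with the estimator analyzed here, the sketch below computes the ordinary least squares estimate in the simplest possible case: a scalar system x[t+1] = a * x[t] + noise observed along a single trajectory. The scalar restriction and the function name are assumptions made for this example; the paper treats general stable linear time-invariant systems.

```python
def ols_scalar_lti(trajectory):
    """OLS estimate of a in x[t+1] = a * x[t] + noise from one trajectory.

    Minimizes sum_t (x[t+1] - a * x[t])**2, which gives
    a_hat = sum_t x[t+1] * x[t] / sum_t x[t]**2.
    """
    num = sum(x_next * x for x, x_next in zip(trajectory, trajectory[1:]))
    den = sum(x * x for x in trajectory[:-1])
    if den == 0.0:
        raise ValueError("trajectory carries no excitation (all states are zero)")
    return num / den
```

On a noise-free trajectory the estimate recovers the true coefficient; with noise, the estimation error depends on the trajectory length.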


Distributed Online Optimization with Long-Term Constraints

no code implementations • 20 Dec 2019 • Deming Yuan, Alexandre Proutiere, Guodong Shi

When the loss functions are strongly convex, we establish improved regret and constraint violation upper bounds in $\mathcal{O}(\log(T))$ and $\mathcal{O}(\sqrt{T\log(T)})$, respectively.


Optimal Sampling and Clustering in the Stochastic Block Model

1 code implementation • NeurIPS 2019 • Se-Young Yun, Alexandre Proutiere

We derive information-theoretical upper bounds on the cluster recovery rate.

Clustering, Stochastic Block Model


Optimal Clustering from Noisy Binary Feedback

no code implementations • 14 Oct 2019 • Kaito Ariu, Jungseul Ok, Alexandre Proutiere, Se-Young Yun

The objective is to devise an algorithm with a minimal cluster recovery error rate.

Clustering, Question Selection


An Optimal Algorithm for Multiplayer Multi-Armed Bandits

no code implementations • 28 Sep 2019 • Alexandre Proutiere, Po-An Wang

We present DPE (Decentralized Parsimonious Exploration), a decentralized algorithm that achieves the same regret as that obtained by an optimal centralized algorithm.

Multi-Armed Bandits


Optimal Attacks on Reinforcement Learning Policies

no code implementations • 31 Jul 2019 • Alessio Russo, Alexandre Proutiere

Finally, we show that, from the main agent's perspective, the system uncertainties and the attacker can be modeled as a Partially Observable Markov Decision Process.

Reinforcement Learning (RL)


From self-tuning regulators to reinforcement learning and back again

no code implementations • 27 Jun 2019 • Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu

Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world.

Reinforcement Learning (RL)


Sample Complexity Lower Bounds for Linear System Identification

no code implementations • 25 Mar 2019 • Yassir Jedra, Alexandre Proutiere

For controlled systems, our lower bounds are not as explicit as in the case of uncontrolled systems, but could well provide interesting insights into the design of control policies with minimal sample complexity.



Distributed Online Linear Regression

no code implementations • 13 Feb 2019 • Deming Yuan, Alexandre Proutiere, Guodong Shi

We propose simple and natural distributed regression algorithms, involving, at each node and in each round, a local gradient descent step and a communication and averaging step where nodes aim at aligning their predictors to those of their neighbors.

regression
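
The round structure described in this abstract (a local gradient descent step followed by a communication and averaging step) can be sketched as follows. This is only an illustration under stated assumptions: the squared loss, the uniform averaging over a node and its neighbors, and all names below are choices made for this example and are not confirmed by the paper.

```python
def distributed_regression_round(weights, samples, neighbors, step_size):
    """One round: local gradient step on the squared loss, then neighbor averaging.

    weights[i]   -- current predictor (list of floats) at node i
    samples[i]   -- the (features, target) pair observed by node i this round
    neighbors[i] -- indices of the nodes adjacent to node i
    """
    # Local gradient descent step: gradient of (w.x - y)**2 is 2 * (w.x - y) * x.
    stepped = []
    for w, (x, y) in zip(weights, samples):
        error = sum(wi * xi for wi, xi in zip(w, x)) - y
        stepped.append([wi - step_size * 2.0 * error * xi for wi, xi in zip(w, x)])
    # Communication and averaging step: uniform average over self and neighbors
    # (an assumed weighting, chosen for simplicity).
    averaged = []
    for i, nbrs in enumerate(neighbors):
        group = [stepped[i]] + [stepped[j] for j in nbrs]
        averaged.append([sum(col) / len(group) for col in zip(*group)])
    return averaged
```

Repeating this round pulls each node toward its own data while the averaging step aligns the predictors of neighboring nodes.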


Learning to Personalize in Appearance-Based Gaze Tracking

no code implementations • 2 Jul 2018 • Erik Lindén, Jonas Sjöstrand, Alexandre Proutiere

We contribute to gaze tracking research by empirically showing that personal variations are well-modeled as a 3-dimensional latent parameter space for each eye.

Gaze Estimation


Exploration in Structured Reinforcement Learning

no code implementations • NeurIPS 2018 • Jungseul Ok, Alexandre Proutiere, Damianos Tranos

For Lipschitz MDPs, the bounds are shown not to scale with the sizes $S$ and $A$ of the state and action spaces, i.e., they are smaller than $c\log T$ where $T$ is the time horizon and the constant $c$ only depends on the Lipschitz structure, the span of the bias function, and the minimal action sub-optimality gap.

Reinforcement Learning (RL)


Minimal Exploration in Structured Stochastic Bandits

no code implementations • NeurIPS 2017 • Richard Combes, Stefan Magureanu, Alexandre Proutiere

This paper introduces and addresses a wide class of stochastic bandit problems where the function mapping the arm to the corresponding reward exhibits some known structural properties.

Thompson Sampling


Fast and Memory Optimal Low-Rank Matrix Approximation

no code implementations • NeurIPS 2015 • Se-Young Yun, Marc Lelarge, Alexandre Proutiere

This means that its average mean-square error converges to 0 as $m$ and $n$ grow large (i.e., $\|\hat{M}^{(k)}-M^{(k)} \|_F^2 = o(mn)$ with high probability, where $\hat{M}^{(k)}$ and $M^{(k)}$ denote the output of SLA and the optimal rank $k$ approximation of $M$, respectively).


Optimal Cluster Recovery in the Labeled Stochastic Block Model

no code implementations • NeurIPS 2016 • Se-Young Yun, Alexandre Proutiere

We find the set of parameters such that there exists a clustering algorithm with at most $s$ misclassified items on average under the general LSBM and for any $s=o(n)$, which solves one open problem raised in \cite{abbe2015community}.

Clustering, Community Detection


Cluster-Aided Mobility Predictions

no code implementations • 12 Jul 2015 • Jaeseong Jeong, Mathieu Leconte, Alexandre Proutiere

In this paper, we develop cluster-aided predictors that exploit past trajectories collected from all users to predict the next location of a given user.

Clustering


Streaming, Memory Limited Matrix Completion with Noise

no code implementations • 13 Apr 2015 • Se-Young Yun, Marc Lelarge, Alexandre Proutiere

We propose a streaming algorithm which produces an estimate of the original matrix with a vanishing mean square error, uses memory space scaling linearly with the ambient dimension of the matrix, i.e., the memory required to store the output alone, and performs a number of computations proportional to the number of non-zero entries of the input matrix.

Matrix Completion


Combinatorial Bandits Revisited

1 code implementation • NeurIPS 2015 • Richard Combes, M. Sadegh Talebi, Alexandre Proutiere, Marc Lelarge

In the adversarial setting under bandit feedback, we propose CombEXP, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.


Accurate Community Detection in the Stochastic Block Model via Spectral Algorithms

no code implementations • 23 Dec 2014 • Se-Young Yun, Alexandre Proutiere

We consider the problem of community detection in the Stochastic Block Model with a finite number $K$ of communities of sizes linearly growing with the network size $n$.

Social and Information Networks, Data Structures and Algorithms


Streaming, Memory Limited Algorithms for Community Detection

no code implementations • NeurIPS 2014 • Se-Young Yun, Marc Lelarge, Alexandre Proutiere

The first algorithm is offline, as it needs to store and keep the assignments of nodes to clusters, and requires a memory that scales linearly with the network size.

Clustering, Community Detection


Unimodal Bandits without Smoothness

no code implementations • 28 Jun 2014 • Richard Combes, Alexandre Proutiere

To our knowledge, the SP algorithm constitutes the first sequential arm selection rule that achieves a regret and optimization error scaling as $O(\sqrt{T})$ and $O(1/\sqrt{T})$, respectively, up to a logarithmic factor for non-smooth expected reward functions, as well as for smooth functions with unknown smoothness.


Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms

no code implementations • 20 May 2014 • Richard Combes, Alexandre Proutiere

We also provide a regret upper bound for OSUB in non-stationary environments where the expected rewards smoothly evolve over time.

Multi-Armed Bandits


Lipschitz Bandits: Regret Lower Bounds and Optimal Algorithms

no code implementations • 19 May 2014 • Stefan Magureanu, Richard Combes, Alexandre Proutiere

For discrete Lipschitz bandits, we derive asymptotic problem-specific lower bounds for the regret satisfied by any algorithm, and propose OSLB and CKL-UCB, two algorithms that efficiently exploit the Lipschitz structure of the problem.

Multi-Armed Bandits


Dynamic Rate and Channel Selection in Cognitive Radio Systems

no code implementations • 23 Feb 2014 • Richard Combes, Alexandre Proutiere

In turn, the proposed algorithms optimally exploit the inherent structure of the throughput.


Two-Target Algorithms for Infinite-Armed Bandits with Bernoulli Rewards

no code implementations • NeurIPS 2013 • Thomas Bonald, Alexandre Proutiere

This two-target algorithm achieves a long-term average regret in $\sqrt{2n}$ for a large parameter $m$ and a known time horizon $n$.



Stochastic Online Shortest Path Routing: The Value of Feedback

no code implementations • 27 Sep 2013 • M. Sadegh Talebi, Zhenhua Zou, Richard Combes, Alexandre Proutiere, Mikael Johansson

The parameters, and hence the optimal path, can only be estimated by routing packets through the network and observing the realized delays.


Spectrum Bandit Optimization

no code implementations • 27 Feb 2013 • Marc Lelarge, Alexandre Proutiere, M. Sadegh Talebi

We consider the problem of allocating radio channels to links in a wireless network.

