Search Results for author: Igor Colin

Found 13 papers, 0 papers with code

Differentially Private Model-Based Offline Reinforcement Learning

no code implementations · 8 Feb 2024 · Alexandre Rio, Merwan Barlier, Igor Colin, Albert Thomas

We address offline reinforcement learning with privacy guarantees, where the goal is to train a policy that is differentially private with respect to individual trajectories in the dataset.
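The paper's specific model-based mechanism is not shown here, but the core primitive behind differential privacy guarantees like these is releasing a quantity with calibrated noise. As a generic illustration (not the authors' algorithm), a minimal Gaussian-mechanism sketch might look like:

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta):
    """Release `value` with (epsilon, delta)-differential privacy.

    Classic Gaussian mechanism: add noise scaled to the L2 sensitivity
    (the maximum change of `value` when one record, e.g. one trajectory,
    is swapped in the dataset).
    """
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + np.random.normal(0.0, sigma, size=np.shape(value))

# Privatize a 3-dimensional statistic with sensitivity 1.
noisy = gaussian_mechanism(np.zeros(3), sensitivity=1.0, epsilon=1.0, delta=1e-5)
```

In trajectory-level DP, the sensitivity is taken with respect to replacing an entire trajectory rather than a single transition.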


Clustered Multi-Agent Linear Bandits

no code implementations · 15 Sep 2023 · Hamza Cherkaoui, Merwan Barlier, Igor Colin

In this paper, we address a particular instance of the multi-agent linear stochastic bandit problem, called clustered multi-agent linear bandits.


Price of Safety in Linear Best Arm Identification

no code implementations · 15 Sep 2023 · Xuedong Shang, Igor Colin, Merwan Barlier, Hamza Cherkaoui

We introduce the safe best-arm identification framework with linear feedback, where the agent is subject to some stage-wise safety constraint that linearly depends on an unknown parameter vector.

An $\alpha$-No-Regret Algorithm For Graphical Bilinear Bandits

no code implementations · 1 Jun 2022 · Geovani Rizk, Igor Colin, Albert Thomas, Rida Laraki, Yann Chevaleyre

We propose the first regret-based approach to the Graphical Bilinear Bandits problem, where $n$ agents in a graph play a stochastic bilinear bandit game with each of their neighbors.

Refined bounds for randomized experimental design

no code implementations · 22 Dec 2020 · Geovani Rizk, Igor Colin, Albert Thomas, Moez Draief

Experimental design is an approach for selecting samples among a given set so as to obtain the best estimator for a given criterion.

Experimental Design
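The abstract describes experimental design as selecting samples so as to obtain the best estimator for a given criterion. The paper's randomized designs are not reproduced here, but a standard baseline the concept can be illustrated with is greedy D-optimal design, which repeatedly adds the sample that most increases the log-determinant of the information matrix (a sketch under that assumption, not the paper's method):

```python
import numpy as np

def greedy_d_optimal(X, k, reg=1e-6):
    """Greedily select k rows of X (candidate design points) to maximize
    log det of the information matrix A = sum of x x^T over chosen rows."""
    d = X.shape[1]
    A = reg * np.eye(d)  # small regularizer keeps A invertible
    chosen = []
    for _ in range(k):
        # Matrix determinant lemma: det(A + x x^T) = det(A) * (1 + x^T A^{-1} x),
        # so the best addition maximizes x^T A^{-1} x.
        Ainv = np.linalg.inv(A)
        gains = np.einsum('ij,jk,ik->i', X, Ainv, X)
        gains[chosen] = -np.inf  # do not pick the same point twice
        i = int(np.argmax(gains))
        chosen.append(i)
        A += np.outer(X[i], X[i])
    return chosen
```

For least-squares regression, maximizing this determinant shrinks the confidence ellipsoid of the parameter estimate.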

Best Arm Identification in Graphical Bilinear Bandits

no code implementations · 14 Dec 2020 · Geovani Rizk, Albert Thomas, Igor Colin, Rida Laraki, Yann Chevaleyre

We study the best arm identification problem in which the learner wants to find the graph allocation maximizing the sum of the bilinear rewards.

A Simple and Efficient Smoothing Method for Faster Optimization and Local Exploration

no code implementations · NeurIPS 2020 · Kevin Scaman, Ludovic Dos Santos, Merwan Barlier, Igor Colin

This novel smoothing method is then used to improve first-order non-smooth optimization (both convex and non-convex) by allowing for a local exploration of the search space.
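The paper's particular smoothing scheme is not detailed in this snippet. For intuition only, the classic Gaussian randomized-smoothing estimator (which the abstract's idea of trading smoothness for local exploration builds on, though this is an assumption, not the authors' construction) replaces $f$ by $f_\sigma(x) = \mathbb{E}[f(x + \sigma Z)]$ and estimates its gradient from zeroth-order samples:

```python
import numpy as np

def smoothed_grad(f, x, sigma=0.1, n_samples=1000, rng=None):
    """Monte Carlo estimate of grad f_sigma(x), where
    f_sigma(x) = E[f(x + sigma * Z)] with Z standard Gaussian.

    Uses the identity grad f_sigma(x) = E[f(x + sigma * Z) * Z] / sigma,
    which needs only function evaluations (no gradients of f).
    """
    rng = np.random.default_rng(rng)
    z = rng.standard_normal((n_samples, x.size))
    vals = np.array([f(x + sigma * zi) for zi in z])
    return (vals[:, None] * z).mean(axis=0) / sigma
```

Each perturbed evaluation `f(x + sigma * z)` also probes a neighborhood of `x`, which is the "local exploration" flavor the abstract alludes to.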

Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning

no code implementations · NeurIPS 2019 · Igor Colin, Ludovic Dos Santos, Kevin Scaman

For smooth convex and non-convex objective functions, we provide matching lower and upper complexity bounds and show that a naive pipeline parallelization of Nesterov's accelerated gradient descent is optimal.

Parallel Contextual Bandits in Wireless Handover Optimization

no code implementations · 21 Jan 2019 · Igor Colin, Albert Thomas, Moez Draief

As cellular networks become denser, a scalable and dynamic tuning of wireless base station parameters can only be achieved through automated optimization.

Multi-Armed Bandits · Thompson Sampling
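The tags mention Thompson sampling. As a generic illustration of that building block (not the paper's parallel or contextual variant), a minimal Bernoulli-bandit Thompson sampler with Beta(1, 1) priors can be sketched as:

```python
import numpy as np

def thompson_sampling_bernoulli(reward_probs, horizon, rng=None):
    """Thompson sampling on a Bernoulli bandit.

    Each arm keeps a Beta posterior over its success probability; at each
    round we sample from every posterior and pull the arm with the largest
    sample, then update that arm's posterior with the observed reward.
    """
    rng = np.random.default_rng(rng)
    k = len(reward_probs)
    successes = np.ones(k)  # Beta alpha parameters (uniform prior)
    failures = np.ones(k)   # Beta beta parameters
    pulls = np.zeros(k, dtype=int)
    for _ in range(horizon):
        samples = rng.beta(successes, failures)
        arm = int(np.argmax(samples))
        reward = rng.random() < reward_probs[arm]
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls
```

Over time the posterior of the best arm concentrates and it gets pulled far more often, which is the behavior base-station parameter tuning would exploit.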

Decentralized Topic Modelling with Latent Dirichlet Allocation

no code implementations · 5 Oct 2016 · Igor Colin, Christophe Dupuy

Privacy-preserving networks can be modelled as decentralized networks (e.g., sensors, connected objects, smartphones), where communication between nodes of the network is not controlled by an all-knowing, central node.

Privacy Preserving · Topic Models

Extending Gossip Algorithms to Distributed Estimation of U-Statistics

no code implementations · NeurIPS 2015 · Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems.
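The paper extends gossip algorithms beyond simple averages to U-statistics; that extension is not reproduced here. The underlying primitive, though, is plain pairwise gossip averaging, where a random edge is activated and its two endpoints average their values. Since each step preserves the network-wide sum, every node converges to the global mean on a connected graph. A minimal sketch:

```python
import numpy as np

def gossip_average(values, edges, n_rounds, rng=None):
    """Asynchronous pairwise gossip: at each round, pick a random edge
    and replace both endpoint values by their average.

    The sum of all values is invariant, so on a connected graph every
    node converges to the global mean."""
    rng = np.random.default_rng(rng)
    x = np.array(values, dtype=float)
    for _ in range(n_rounds):
        i, j = edges[rng.integers(len(edges))]
        x[i] = x[j] = (x[i] + x[j]) / 2.0
    return x

# 4-node ring network; every node should converge to the mean, 6.0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
final = gossip_average([0.0, 4.0, 8.0, 12.0], ring, n_rounds=2000, rng=0)
```

Estimating a degree-2 U-statistic this way is harder because each term couples data held by two different nodes, which is exactly the difficulty the paper addresses.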

Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics

no code implementations · 12 Jan 2015 · Stéphan Clémençon, Aurélien Bellet, Igor Colin

In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by $U$-statistics of degree $d\geq 1$, i.e. functionals of the training data with low variance that take the form of averages over $d$-tuples.

Clustering · Metric Learning +1
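To make the object concrete: a degree-2 $U$-statistic averages a kernel over all pairs of training points, which costs $O(n^2)$ evaluations; the "incomplete" version the paper studies averages over a random subset of tuples instead. A minimal sketch of both (an illustration of the estimator, not the paper's optimization algorithm):

```python
import numpy as np
from itertools import combinations

def u_statistic(data, kernel):
    """Complete degree-2 U-statistic: average of the kernel over all
    n*(n-1)/2 unordered pairs of data points."""
    pairs = combinations(range(len(data)), 2)
    return np.mean([kernel(data[i], data[j]) for i, j in pairs])

def incomplete_u_statistic(data, kernel, n_pairs, rng=None):
    """Incomplete U-statistic: average over pairs sampled uniformly with
    replacement. For a symmetric kernel this is an unbiased estimate of
    the complete U-statistic at a fraction of the cost."""
    rng = np.random.default_rng(rng)
    n = len(data)
    idx = rng.integers(0, n, size=(n_pairs, 2))
    idx = idx[idx[:, 0] != idx[:, 1]]  # drop degenerate i == i pairs
    return np.mean([kernel(data[i], data[j]) for i, j in idx])
```

For example, with `kernel = lambda a, b: abs(a - b)` the complete statistic is the Gini mean difference of the sample; the incomplete version approximates it using only `n_pairs` kernel evaluations.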
