Search Results for author: Nicolò Cesa-Bianchi

Found 60 papers, 2 papers with code

Information Capacity Regret Bounds for Bandits with Mediator Feedback

no code implementations15 Feb 2024 Khaled Eldowa, Nicolò Cesa-Bianchi, Alberto Maria Metelli, Marcello Restelli

For a selection of policy set families, we prove nearly-matching lower bounds, scaling similarly with the capacity.

Sum-max Submodular Bandits

no code implementations10 Nov 2023 Stephen Pasteris, Alberto Rumi, Fabio Vitale, Nicolò Cesa-Bianchi

Many online decision-making problems correspond to maximizing a sequence of submodular functions.

Decision Making

Multitask Online Learning: Listen to the Neighborhood Buzz

no code implementations26 Oct 2023 Juliette Achddou, Nicolò Cesa-Bianchi, Pierre Laforgue

We study multitask online learning in a setting where agents can only exchange information with their neighbors on an arbitrary communication network.

High-Probability Risk Bounds via Sequential Predictors

no code implementations15 Aug 2023 Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

Online learning methods yield sequential regret bounds under minimal assumptions and provide in-expectation risk bounds for statistical learning.

Density Estimation · regression

The Role of Transparency in Repeated First-Price Auctions with Unknown Valuations

no code implementations14 Jul 2023 Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

We study the problem of regret minimization for a single bidder in a sequence of first-price auctions where the bidder discovers the item's value only if the auction is won.

Delayed Bandits: When Do Intermediate Observations Help?

no code implementations30 May 2023 Emmanuel Esposito, Saeed Masoudian, Hao Qiu, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

However, if the mapping of states to losses is stochastic, we show that the regret grows at a rate of $\sqrt{\big(K+\min\{|\mathcal{S}|, d\}\big)T}$ (within log factors), implying that if the number $|\mathcal{S}|$ of states is smaller than the delay, then intermediate observations help.

Information-Theoretic Regret Bounds for Bandits with Fixed Expert Advice

no code implementations14 Mar 2023 Khaled Eldowa, Nicolò Cesa-Bianchi, Alberto Maria Metelli, Marcello Restelli

We investigate the problem of bandits with expert advice when the experts are fixed and known distributions over the actions.

Repeated Bilateral Trade Against a Smoothed Adversary

no code implementations21 Feb 2023 Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

We provide a complete characterization of the regret regimes for fixed-price mechanisms under different feedback models in the two cases where the learner can post either the same or different prices to buyers and sellers.

Linear Bandits with Memory: from Rotting to Rising

no code implementations16 Feb 2023 Giulia Clerici, Pierre Laforgue, Nicolò Cesa-Bianchi

By choosing the cycle length so as to trade-off approximation and estimation errors, we then prove a bound of order $\sqrt{d}\,(m+1)^{\frac{1}{2}+\max\{\gamma, 0\}}\, T^{3/4}$ (ignoring log factors) on the regret against the optimal sequence of actions, where $T$ is the horizon and $d$ is the dimension of the linear action space.

Decision Making · Model Selection

Learning on the Edge: Online Learning with Stochastic Feedback Graphs

no code implementations9 Oct 2022 Emmanuel Esposito, Federico Fusco, Dirk van der Hoeven, Nicolò Cesa-Bianchi

The framework of feedback graphs is a generalization of sequential decision-making with bandit or full information feedback.

Decision Making

Active Learning of Classifiers with Label and Seed Queries

no code implementations8 Sep 2022 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice, Maximilian Thiessen

In this work we show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2 \log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries; the result extends to $k$-class classifiers at the price of a $k! k^2$ multiplicative overhead.

Active Learning

Online Learning in Supply-Chain Games

no code implementations8 Jul 2022 Nicolò Cesa-Bianchi, Tommaso Cesari, Takayuki Osogami, Marco Scarsini, Segev Wasserkrug

We study a repeated game between a supplier and a retailer who want to maximize their respective profits without full knowledge of the problem parameters.

A Regret-Variance Trade-Off in Online Learning

no code implementations6 Jun 2022 Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

We prove that a variant of EWA either achieves a negative regret (i.e., the algorithm outperforms the best expert), or guarantees an $O(\log K)$ bound on both variance and regret.
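
The variance-dependent variant itself is not spelled out in the snippet; for reference, a minimal sketch of the standard exponentially weighted average (EWA/Hedge) forecaster it builds on, assuming losses in [0, 1] and an illustrative fixed learning rate:

```python
import numpy as np

def ewa(loss_matrix, eta=0.5):
    """Exponentially weighted average (Hedge) forecaster.

    loss_matrix: array of shape (T, K) with expert losses in [0, 1].
    Returns the cumulative expected loss of the forecaster.
    """
    T, K = loss_matrix.shape
    log_w = np.zeros(K)                 # log-weights, uniform start
    total = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                    # current distribution over experts
        total += float(p @ loss_matrix[t])
        log_w -= eta * loss_matrix[t]   # multiplicative weight update
    return total
```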

Model Selection

A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs

no code implementations1 Jun 2022 Chloé Rouyer, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

The algorithm combines ideas from the EXP3++ algorithm for stochastic and adversarial bandits and the EXP3.G algorithm for feedback graphs with a novel exploration scheme.
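
Neither EXP3++ nor EXP3.G is reproduced in the snippet; as background, a minimal sketch of plain EXP3-style importance weighting adapted to graph feedback, under assumed interfaces (`get_losses`, `in_neighbors`) and an illustrative fixed learning rate — a reference point, not the paper's algorithm:

```python
import numpy as np

def exp3_with_graph_feedback(K, T, get_losses, in_neighbors, eta=0.05, seed=0):
    """EXP3-style learner under (informed) graph feedback.

    in_neighbors[i]: set of actions whose play reveals the loss of action i
                     (assumed to contain i itself, i.e. self-loops).
    get_losses(t):   full loss vector in [0, 1]^K; only observable entries
                     are used by the learner.
    """
    rng = np.random.default_rng(seed)
    log_w = np.zeros(K)
    played = []
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()
        a = int(rng.choice(K, p=p))
        played.append(a)
        losses = get_losses(t)
        est = np.zeros(K)
        for i in range(K):
            if a in in_neighbors[i]:                    # loss of i is revealed
                prob_observe = p[list(in_neighbors[i])].sum()
                est[i] = losses[i] / prob_observe       # importance-weighted estimate
        log_w -= eta * est                              # exponential-weights update
    return played
```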

Decision Making

AdaTask: Adaptive Multitask Online Learning

no code implementations31 May 2022 Pierre Laforgue, Andrea Della Vecchia, Nicolò Cesa-Bianchi, Lorenzo Rosasco

We introduce and analyze AdaTask, a multitask online learning algorithm that adapts to the unknown structure of the tasks.

Nonstochastic Bandits with Composite Anonymous Feedback

no code implementations6 Dec 2021 Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Claudio Gentile, Yishay Mansour

We investigate a nonstochastic bandit setting in which the loss of an action is not immediately charged to the player, but rather spread over the subsequent rounds in an adversarial way.

Nonstochastic Bandits and Experts with Arm-Dependent Delays

no code implementations2 Nov 2021 Dirk van der Hoeven, Nicolò Cesa-Bianchi

We study nonstochastic bandits and experts in a delayed setting where delays depend on both time and arms.

A Last Switch Dependent Analysis of Satiation and Seasonality in Bandits

1 code implementation22 Oct 2021 Pierre Laforgue, Giulia Clerici, Nicolò Cesa-Bianchi, Ran Gilad-Bachrach

Motivated by the fact that humans like some level of unpredictability or novelty, and might therefore get quickly bored when interacting with a stationary policy, we introduce a novel non-stationary bandit problem, where the expected reward of an arm is fully determined by the time elapsed since the arm last took part in a switch of actions.

Bilateral Trade: A Regret Minimization Perspective

no code implementations8 Sep 2021 Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

In this paper, we cast the bilateral trade problem in a regret minimization framework over $T$ rounds of seller/buyer interactions, with no prior knowledge on their private valuations.

Cooperative Online Learning with Feedback Graphs

no code implementations9 Jun 2021 Nicolò Cesa-Bianchi, Tommaso R. Cesari, Riccardo Della Vecchia

We study the interplay between feedback and communication in a cooperative online learning setting where a network of agents solves a task in which the learners' feedback is determined by an arbitrary graph.

On Margin-Based Cluster Recovery with Oracle Queries

no code implementations NeurIPS 2021 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

We study an active cluster recovery problem where, given a set of $n$ points and an oracle answering queries like "are these two points in the same cluster?"

Beyond Bandit Feedback in Online Multiclass Classification

no code implementations NeurIPS 2021 Dirk van der Hoeven, Federico Fusco, Nicolò Cesa-Bianchi

We study the problem of online multiclass classification in a setting where the learner's feedback is determined by an arbitrary directed graph.

Classification

Multitask Online Mirror Descent

no code implementations NeurIPS 2021 Nicolò Cesa-Bianchi, Pierre Laforgue, Andrea Paudice, Massimiliano Pontil

We introduce and analyze MT-OMD, a multitask generalization of Online Mirror Descent (OMD) which operates by sharing updates between tasks.

Finding Stable Matchings in PhD Markets with Consistent Preferences and Cooperative Partners

no code implementations23 Feb 2021 Maximilian Mordig, Riccardo Della Vecchia, Nicolò Cesa-Bianchi, Bernhard Schölkopf

Our setting is motivated by a PhD market of students, advisors, and co-advisors, and can be generalized to supply chain networks viewed as $n$-sided markets.

Computer Science and Game Theory · Theoretical Economics · Combinatorics

An Algorithm for Stochastic and Adversarial Bandits with Switching Costs

no code implementations19 Feb 2021 Chloé Rouyer, Yevgeny Seldin, Nicolò Cesa-Bianchi

In the stochastically constrained adversarial regime, which includes the stochastic regime as a special case, it achieves a regret bound of $O\left(\big((\lambda K)^{2/3} T^{1/3} + \ln T\big)\sum_{i \neq i^*} \Delta_i^{-1}\right)$, where $\Delta_i$ are the suboptimality gaps and $i^*$ is a unique optimal arm.

A Regret Analysis of Bilateral Trade

no code implementations16 Feb 2021 Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Federico Fusco, Stefano Leonardi

Despite the simplicity of this problem, a classical result by Myerson and Satterthwaite (1983) affirms the impossibility of designing a mechanism which is simultaneously efficient, incentive compatible, individually rational, and budget balanced.

Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

no code implementations31 Jan 2021 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points.

Exact Recovery of Mangled Clusters with Same-Cluster Queries

no code implementations NeurIPS 2020 Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible.

Clustering

Locally-Adaptive Nonparametric Online Learning

no code implementations NeurIPS 2020 Ilja Kuzborskij, Nicolò Cesa-Bianchi

When competing against "simple" locality profiles, our technique delivers regret bounds that are significantly better than those proven using the previous approach.

Stochastic Bandits with Delay-Dependent Payoffs

no code implementations7 Oct 2019 Leonardo Cella, Nicolò Cesa-Bianchi

Motivated by recommendation problems in music streaming platforms, we propose a nonstationary stochastic bandit model in which the expected reward of an arm depends on the number of rounds that have passed since the arm was last pulled.

Nonstochastic Multiarmed Bandits with Unrestricted Delays

no code implementations NeurIPS 2019 Tobias Sommer Thune, Nicolò Cesa-Bianchi, Yevgeny Seldin

We then introduce a new algorithm that lifts the requirement of bounded delays by using a wrapper that skips rounds with excessively large delays.
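
A minimal sketch of the skipping idea as described in the snippet, using a hypothetical base-learner interface (`act`/`update`) and an assumed delay threshold; the paper's actual tuning of the threshold is not reproduced:

```python
def run_with_skipping(base, T, arriving_feedback, delay_threshold):
    """Wrapper around a bandit algorithm that skips overly delayed feedback.

    base:                      hypothetical bandit object with .act() and .update(arm, loss)
    arriving_feedback(t, arm): list of (round_played, arm, loss) tuples whose
                               delayed feedback becomes available at round t
    Feedback older than delay_threshold rounds is dropped, i.e. those
    rounds are effectively skipped by the base learner.
    """
    for t in range(T):
        arm = base.act()
        for s, a, loss in arriving_feedback(t, arm):
            if t - s <= delay_threshold:
                base.update(a, loss)
            # else: skip -- the base algorithm never sees this round's loss
```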

ROI Maximization in Stochastic Online Decision-Making

no code implementations NeurIPS 2021 Nicolò Cesa-Bianchi, Tommaso Cesari, Yishay Mansour, Vianney Perchet

We introduce a novel theoretical framework for Return On Investment (ROI) maximization in repeated decision-making.

Decision Making

Correlation Clustering with Adaptive Similarity Queries

1 code implementation NeurIPS 2019 Marco Bressan, Nicolò Cesa-Bianchi, Andrea Paudice, Fabio Vitale

In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disagreements and the total number of queries.

Active Learning · Clustering

Distribution-Dependent Analysis of Gibbs-ERM Principle

no code implementations5 Feb 2019 Ilja Kuzborskij, Nicolò Cesa-Bianchi, Csaba Szepesvári

This is a well-established notion of effective dimension appearing in several previous works, including the analyses of SGD and ridge regression, but ours is the first work that brings this dimension to the analysis of learning using Gibbs densities.

Stochastic Optimization

Cooperative Online Learning: Keeping your Neighbors Updated

no code implementations23 Jan 2019 Nicolò Cesa-Bianchi, Tommaso R. Cesari, Claire Monteleoni

However, when agents can choose to ignore some of their neighbors based on the knowledge of the network structure, we prove an $O(\sqrt{\overline{\chi} T})$ sublinear regret bound, where $\overline{\chi} \ge \alpha$ is the clique-covering number of the network.

Efficient Linear Bandits through Matrix Sketching

no code implementations28 Sep 2018 Ilja Kuzborskij, Leonardo Cella, Nicolò Cesa-Bianchi

More precisely, we show that a sketch of size $m$ allows an $\mathcal{O}(md)$ update time for both algorithms, as opposed to $\Omega(d^2)$ required by their non-sketched versions in general (where $d$ is the dimension of context vectors).
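
The paper's sketched bandit algorithms are not reproduced here; as an illustration of the kind of rank-$m$ sketch involved, a minimal Frequent Directions-style update (an assumed sketching primitive, not taken from the snippet) whose amortized cost per inserted row is O(md), assuming m <= d:

```python
import numpy as np

def frequent_directions_update(S, x):
    """One Frequent Directions step: absorb the new row x into the m x d sketch S.

    Assumes m <= d. When no zero row is left, an SVD-based shrinking step
    frees half of the rows; since the SVD runs only every ~m/2 insertions,
    the amortized cost per update is O(m d).
    """
    m, d = S.shape
    zero_rows = np.flatnonzero(~S.any(axis=1))
    if zero_rows.size == 0:
        _, sigma, Vt = np.linalg.svd(S, full_matrices=False)
        delta = sigma[m // 2] ** 2
        sigma = np.sqrt(np.maximum(sigma ** 2 - delta, 0.0))
        S = np.diag(sigma) @ Vt                    # bottom half of the rows is now zero
        zero_rows = np.flatnonzero(~S.any(axis=1))
    S[zero_rows[0]] = x                            # insert the new row
    return S
```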

Thompson Sampling

Dynamic Pricing with Finitely Many Unknown Valuations

no code implementations9 Jul 2018 Nicolò Cesa-Bianchi, Tommaso Cesari, Vianney Perchet

When $K=2$ in the distribution-dependent case, the hardness of our setting reduces to that of a stochastic $2$-armed bandit: we prove that an upper bound of order $(\log T)/\Delta$ (up to $\log\log$ factors) on the regret can be achieved with no information on the demand curve.

Positive and Unlabeled Learning through Negative Selection and Imbalance-aware Classification

no code implementations18 May 2018 Marco Frasca, Nicolò Cesa-Bianchi

Motivated by applications in protein function prediction, we consider a challenging supervised classification setting in which positive labels are scarce and there are no explicit negative labels.

Active Learning · General Classification +1

Boltzmann Exploration Done Right

no code implementations NeurIPS 2017 Nicolò Cesa-Bianchi, Claudio Gentile, Gábor Lugosi, Gergely Neu

Boltzmann exploration is a classic strategy for sequential decision-making under uncertainty, and is one of the most standard tools in Reinforcement Learning (RL).
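
For reference, a minimal sketch of the classic Boltzmann (softmax) exploration rule the paper revisits; the temperature value is an illustrative placeholder, and this is not the corrected variant the paper proposes:

```python
import numpy as np

def boltzmann_action(empirical_means, temperature=0.1, rng=None):
    """Classic Boltzmann (softmax) exploration: sample an arm with
    probability proportional to exp(estimated_mean / temperature)."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(empirical_means, dtype=float) / temperature
    p = np.exp(logits - logits.max())               # subtract max for numerical stability
    p /= p.sum()
    return int(rng.choice(len(p), p=p))
```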

Decision Making · Decision Making Under Uncertainty +2

Nonparametric Online Regression while Learning the Metric

no code implementations NeurIPS 2017 Ilja Kuzborskij, Nicolò Cesa-Bianchi

We study algorithms for online nonparametric regression that learn the directions along which the regression function is smoother.

regression

Bandit Regret Scaling with the Effective Loss Range

no code implementations15 May 2017 Nicolò Cesa-Bianchi, Ohad Shamir

We study how the regret guarantees of nonstochastic multi-armed bandits can be improved if the effective range of the losses in each round is small (e.g., the maximal difference between two losses in a given round).

Multi-Armed Bandits

Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning

no code implementations27 Feb 2017 Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz

We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information.

On the Troll-Trust Model for Edge Sign Prediction in Social Networks

no code implementations1 Jun 2016 Géraud Le Falher, Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale

In the problem of edge sign prediction, we are given a directed graph (representing a social network), and our task is to predict the binary labels of the edges (i.e., the positive or negative nature of the social relationships).

Active Learning for Online Recognition of Human Activities from Streaming Videos

no code implementations11 Apr 2016 Rocco De Rosa, Ilaria Gori, Fabio Cuzzolin, Barbara Caputo, Nicolò Cesa-Bianchi

Recognising human activities from streaming videos poses unique challenges to learning algorithms: predictive models need to be scalable, incrementally trainable, and must remain bounded in size even when the data stream is arbitrarily long.

Active Learning

The ABACOC Algorithm: a Novel Approach for Nonparametric Classification of Data Streams

no code implementations20 Aug 2015 Rocco De Rosa, Francesco Orabona, Nicolò Cesa-Bianchi

Stream mining poses unique challenges to machine learning: predictive models are required to be scalable, incrementally trainable, must remain bounded in size (even when the data stream is arbitrarily long), and be nonparametric in order to achieve high accuracy even in complex and dynamic environments.

General Classification

Online Learning with Feedback Graphs: Beyond Bandits

no code implementations26 Feb 2015 Noga Alon, Nicolò Cesa-Bianchi, Ofer Dekel, Tomer Koren

We study a general class of online learning problems where the feedback is specified by a graph.

On the Complexity of Learning with Kernels

no code implementations5 Nov 2014 Nicolò Cesa-Bianchi, Yishay Mansour, Ohad Shamir

In this paper, we study lower bounds on the error attainable by such methods as a function of the number of entries observed in the kernel matrix or the rank of an approximate kernel matrix.

Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback

no code implementations30 Sep 2014 Noga Alon, Nicolò Cesa-Bianchi, Claudio Gentile, Shie Mannor, Yishay Mansour, Ohad Shamir

This naturally models several situations where the losses of different actions are related, and knowing the loss of one action provides information on the loss of other actions.

Multi-Armed Bandits

Online Learning with Costly Features and Labels

no code implementations NeurIPS 2013 Nicolò Cesa-Bianchi, Ofer Dekel, Ohad Shamir

In particular, we show that with switching costs, the attainable rate with bandit feedback is $T^{2/3}$.

A Gang of Bandits

no code implementations NeurIPS 2013 Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella

Multi-armed bandit problems are receiving a great deal of attention because they adequately formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertisement and, more generally, recommendation systems.

Clustering · Multi-Armed Bandits +1

A Generalized Online Mirror Descent with Applications to Classification and Regression

no code implementations10 Apr 2013 Francesco Orabona, Koby Crammer, Nicolò Cesa-Bianchi

A unifying perspective on the design and the analysis of online algorithms is provided by online mirror descent, a general prediction strategy from which most first-order algorithms can be obtained as special cases.
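
As one concrete instance of the general strategy described above, a minimal sketch of online mirror descent with an entropic regularizer (exponentiated gradient) on the probability simplex; the loss interface and step size are illustrative assumptions:

```python
import numpy as np

def omd_entropic(grad_fn, K, T, eta=0.1):
    """Online mirror descent with the negative-entropy regularizer
    (exponentiated gradient) on the probability simplex.

    grad_fn(t, w): (sub)gradient of the round-t loss at the current point w.
    """
    w = np.full(K, 1.0 / K)
    for t in range(T):
        g = np.asarray(grad_fn(t, w), dtype=float)
        w = w * np.exp(-eta * g)   # mirror step in the dual (log) space
        w /= w.sum()               # normalization = Bregman projection onto the simplex
    return w
```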

General Classification · regression

Mirror Descent Meets Fixed Share (and feels no regret)

no code implementations NeurIPS 2012 Nicolò Cesa-Bianchi, Pierre Gaillard, Gabor Lugosi, Gilles Stoltz

Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension.
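
For the shifting-regret setting mentioned in the snippet, a minimal sketch of the classic fixed-share step layered on top of an exponential-weights update; `alpha` and `eta` are illustrative parameters:

```python
import numpy as np

def fixed_share_step(w, losses, eta=0.1, alpha=0.01):
    """One round of exponential weights followed by fixed-share mixing.

    Mixing a fraction alpha of the mass back in uniformly is what turns
    static regret guarantees into shifting (tracking) regret guarantees.
    """
    v = w * np.exp(-eta * np.asarray(losses, dtype=float))
    v /= v.sum()
    return (1.0 - alpha) * v + alpha / len(v)
```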

A Linear Time Active Learning Algorithm for Link Classification

no code implementations NeurIPS 2012 Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella

We provide a theoretical analysis within this model, showing that we can achieve an optimal (to within a constant factor) number of mistakes on any graph $G = (V, E)$ such that $|E|$ is at least order of $|V|^{3/2}$ by querying at most order of $|V|^{3/2}$ edge labels.

Active Learning · Classification +2

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems

no code implementations25 Apr 2012 Sébastien Bubeck, Nicolò Cesa-Bianchi

Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off.

See the Tree Through the Lines: The Shazoo Algorithm

no code implementations NeurIPS 2011 Fabio Vitale, Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella

Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully satisfactory algorithm is not available yet.

Efficient Transductive Online Learning via Randomized Rounding

no code implementations13 Jun 2011 Nicolò Cesa-Bianchi, Ohad Shamir

Most traditional online learning algorithms are based on variants of mirror descent or follow-the-leader.

Collaborative Filtering · Open-Ended Question Answering

Linear Classification and Selective Sampling Under Low Noise Conditions

no code implementations NeurIPS 2008 Giovanni Cavallanti, Nicolò Cesa-Bianchi, Claudio Gentile

Using the so-called Tsybakov low noise condition to parametrize the instance distribution, we show bounds on the convergence rate to the Bayes risk of both the fully supervised and the selective sampling versions of the basic algorithm.

Classification · General Classification
