1 code implementation • 7 Dec 2022 • Alberto Maria Metelli, Francesco Trovò, Matteo Pirola, Marcello Restelli
This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e., those sequential selection techniques able to learn online using only the feedback given by the chosen option (a.k.a.
no code implementations • 17 Nov 2022 • Marco Mussi, Gianmarco Genalti, Alessandro Nuara, Francesco Trovò, Marcello Restelli, Nicola Gatti
We ran a real-world 4-month-long A/B testing experiment in collaboration with an Italian e-commerce company, in which our algorithm PVD-B (corresponding to the A configuration) was compared with human pricing specialists (corresponding to the B configuration).
no code implementations • 1 Jun 2022 • Giulia Romano, Andrea Agostini, Francesco Trovò, Nicola Gatti, Marcello Restelli
We provide two algorithms to address TP-MAB problems, namely, TP-UCB-FR and TP-UCB-EW, which exploit the partial information disclosed by the reward collected over time.
1 code implementation • 20 May 2022 • Marco Mussi, Davide Lombarda, Alberto Maria Metelli, Francesco Trovò, Marcello Restelli
In this work, we propose a general and flexible framework, namely ARLO: Automated Reinforcement Learning Optimizer, to construct automated pipelines for AutoRL.
no code implementations • 18 Jan 2022 • Matteo Castiglioni, Alessandro Nuara, Giulia Romano, Giorgio Spadaro, Francesco Trovò, Nicola Gatti
More interestingly, we provide an algorithm, namely GCB_{safe}(\psi,\phi), guaranteeing both sublinear pseudo-regret and safety with high probability.
no code implementations • 30 Sep 2021 • Marco Gabrielli, Francesco Trovò, Manuela Antonelli
Instead, in such applications, a set of options is presented sequentially to the learner within a time span, and this process is repeated throughout a time horizon.
no code implementations • 3 Mar 2020 • Alessandro Nuara, Francesco Trovò, Nicola Gatti, Marcello Restelli
We experimentally evaluate our algorithms with synthetic settings generated from real data from Yahoo!, and we present the results of the adoption of our algorithms in a real-world application with an average daily spend of 1,000 Euros for more than one year.
no code implementations • 18 Nov 2019 • Alberto Marchesi, Francesco Trovò, Nicola Gatti
As a result, solving these games begets the challenge of designing learning algorithms that can find (approximate) equilibria with high confidence, using as few simulator queries as possible.
no code implementations • 17 Nov 2016 • Stefano Paladino, Francesco Trovò, Marcello Restelli, Nicola Gatti
We study, to the best of our knowledge, the first Bayesian algorithm for unimodal Multi-Armed Bandit (MAB) problems with graph structure.
no code implementations • 29 Sep 2016 • Giuseppe De Nittis, Francesco Trovò
This survey aims to present the current machine learning techniques employed in security games domains.