Search Results for author: David S. Leslie

Found 18 papers, 5 papers with code

Federated $\mathcal{X}$-armed Bandit with Flexible Personalisation

no code implementations • 11 Sep 2024 • Ali Arabzadeh, James A. Grant, David S. Leslie

This paper introduces a novel approach to personalised federated learning within the $\mathcal{X}$-armed bandit framework, addressing the challenge of optimising both local and global objectives in a highly heterogeneous environment.

Federated Learning

FINN.no Slates Dataset: A new Sequential Dataset Logging Interactions, all Viewed Items and Click Responses/No-Click for Recommender Systems Research

1 code implementation • 5 Nov 2021 • Simen Eide, Arnoldo Frigessi, Helge Jenssen, David S. Leslie, Joakim Rishaug, Sofie Verrewaere

Although the usage of exposure data in recommender systems is growing, to our knowledge there is no open large-scale recommender systems dataset that includes the slates of items presented to the users at each interaction.

Recommendation Systems

Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification

no code implementations • 29 Sep 2021 • James A. Grant, David S. Leslie

We consider a variant of online binary classification where a learner sequentially assigns labels ($0$ or $1$) to items with unknown true class.

Binary Classification • Thompson Sampling
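The setting described above (sequential 0/1 labelling where feedback is only partially observed) is the classic "apple tasting" problem: the true class of an item is revealed only when the learner plays label 1. The sketch below is a generic Beta-Bernoulli Thompson sampling learner for that feedback structure, not the paper's exact Bayesian approach; the threshold-0.5 decision rule and uniform prior are illustrative assumptions.

```python
import random

def apple_tasting_ts(true_labels, seed=0):
    """Beta-Bernoulli Thompson sampling for a partially monitored
    binary stream: the true label is revealed only when we play 1."""
    rng = random.Random(seed)
    a, b = 1, 1  # Beta(a, b) posterior over P(label = 1)
    mistakes = 0
    for y in true_labels:
        p = rng.betavariate(a, b)      # sample from the posterior
        action = 1 if p >= 0.5 else 0  # act greedily on the sample
        if action == 1:                # feedback only on action 1
            a += y
            b += 1 - y
        mistakes += int(action != y)
    return mistakes
```

Because playing 0 yields no feedback, the posterior sampling step is what forces occasional exploratory "tastes" even when the current estimate favours label 0.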

Decentralized Q-Learning in Zero-sum Markov Games

no code implementations • NeurIPS 2021 • Muhammed O. Sayin, Kaiqing Zhang, David S. Leslie, Tamer Basar, Asuman Ozdaglar

The key challenge in this decentralized setting is the non-stationarity of the environment from an agent's perspective, since both her own payoffs and the system evolution depend on the actions of other agents, and each agent adapts her policies simultaneously and independently.

Multi-agent Reinforcement Learning • Q-Learning

Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling

2 code implementations • 30 Apr 2021 • Simen Eide, David S. Leslie, Arnoldo Frigessi

We introduce a variational Bayesian Recurrent Neural Net recommender system that acts on time series of interactions between the internet platform and the user, and which scales to real-world industrial situations.

Recommendation Systems • Thompson Sampling • +1

GIBBON: General-purpose Information-Based Bayesian OptimisatioN

no code implementations • 5 Feb 2021 • Henry B. Moss, David S. Leslie, Javier Gonzalez, Paul Rayson

This paper describes a general-purpose extension of max-value entropy search, a popular approach for Bayesian Optimisation (BO).

Bayesian Optimisation • Point Processes

BOSS: Bayesian Optimization over String Spaces

1 code implementation • NeurIPS 2020 • Henry B. Moss, Daniel Beck, Javier Gonzalez, David S. Leslie, Paul Rayson

This article develops a Bayesian optimization (BO) method which acts directly over raw strings, proposing the first uses of string kernels and genetic algorithms within BO loops.

Bayesian Optimization

Learning to Rank under Multinomial Logit Choice

no code implementations • 7 Sep 2020 • James A. Grant, David S. Leslie

The learning to rank (LTR) framework models this problem as a sequential problem of selecting lists of content and observing where users decide to click.

Learning-To-Rank • Position
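In the multinomial logit (MNL) choice model named in this title, a user shown a list clicks item $i$ with probability proportional to $e^{u_i}$, with a unit-weight "no click" outside option. A minimal sketch of those choice probabilities, with hypothetical utility values (the paper's actual model additionally accounts for list position):

```python
import math

def mnl_click_probs(utilities):
    """Multinomial logit choice over a displayed list: item i is clicked
    with probability exp(u_i) / (1 + sum_j exp(u_j)), where the constant
    1 is the weight of the 'no click' outside option."""
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)
    probs = [w / denom for w in weights]
    no_click = 1.0 / denom
    return probs, no_click
```

The item probabilities and the no-click probability sum to one, so observed clicks (and non-clicks) give likelihood information about every displayed item's utility.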

BOSH: Bayesian Optimization by Sampling Hierarchically

no code implementations • 2 Jul 2020 • Henry B. Moss, David S. Leslie, Paul Rayson

Deployments of Bayesian Optimization (BO) for functions with stochastic evaluations, such as parameter tuning via cross validation and simulation optimization, typically optimize an average of a fixed set of noisy realizations of the objective function.

Bayesian Optimization • reinforcement-learning • +1
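The "typical" practice that this abstract describes (and that BOSH improves upon) can be made concrete: fix a set of noise realisations up front, e.g. as random seeds, and optimise the average of the stochastic objective over exactly those realisations. A sketch, with a hypothetical `noisy_f(x, rng)` standing in for any seeded stochastic evaluation such as one CV split or one simulation run:

```python
import random

def fixed_realisation_objective(noisy_f, seeds):
    """Build a deterministic surrogate objective by averaging a
    stochastic objective over a fixed set of seeded realisations
    (common random numbers)."""
    def averaged(x):
        # Re-seeding per call makes repeated evaluations at the
        # same x return exactly the same value.
        return sum(noisy_f(x, random.Random(s)) for s in seeds) / len(seeds)
    return averaged
```

Fixing the realisations removes evaluation noise at the cost of bias towards that particular sample of realisations, which is the trade-off BOSH's hierarchical sampling is designed to manage.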

MUMBO: MUlti-task Max-value Bayesian Optimization

no code implementations • 22 Jun 2020 • Henry B. Moss, David S. Leslie, Paul Rayson

MUMBO is scalable and efficient, allowing multi-task Bayesian optimization to be deployed in problems with rich parameter and fidelity spaces.

Bayesian Optimization

FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms

1 code implementation • ACL 2019 • Henry B. Moss, Andrew Moore, David S. Leslie, Paul Rayson

We present FIESTA, a model selection approach that significantly reduces the computational resources required to reliably identify state-of-the-art performance from large collections of candidate models.

Model Selection • Sentiment Analysis
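The idea of treating each candidate model as a bandit arm, where one "pull" is one noisy evaluation, can be illustrated with a successive-halving-style loop (a generic adaptive scheme, not FIESTA's actual algorithms): spend evaluations on the current leaders rather than evaluating every candidate equally often.

```python
import statistics

def adaptive_model_select(evaluators, budget_per_round=4):
    """Adaptive model selection sketch: each entry of `evaluators` maps a
    model name to a zero-argument callable returning one noisy score
    (e.g. accuracy on one random CV split). Repeatedly evaluate the
    survivors and drop the worse half until one model remains."""
    scores = {name: [] for name in evaluators}
    survivors = list(evaluators)
    while len(survivors) > 1:
        for name in survivors:
            for _ in range(budget_per_round):
                scores[name].append(evaluators[name]())
        survivors.sort(key=lambda n: statistics.mean(scores[n]), reverse=True)
        survivors = survivors[: max(1, len(survivors) // 2)]
    return survivors[0]
```

Clearly weak models are eliminated after a handful of evaluations, so the budget concentrates on distinguishing the genuinely competitive candidates.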

Adaptive Sensor Placement for Continuous Spaces

no code implementations • 16 May 2019 • James A. Grant, Alexis Boukouvalas, Ryan-Rhys Griffiths, David S. Leslie, Sattar Vakili, Enrique Munoz de Cote

We consider the problem of adaptively placing sensors along an interval to detect stochastically-generated events.

Thompson Sampling

Adaptive Policies for Perimeter Surveillance Problems

no code implementations • 4 Oct 2018 • James A. Grant, David S. Leslie, Kevin Glazebrook, Roberto Szechtman, Adam N. Letchford

Maximising the detection of intrusions is a fundamental and often critical aim of perimeter surveillance.

Bandit learning in concave $N$-person games

no code implementations • 3 Oct 2018 • Mario Bravo, David S. Leslie, Panayotis Mertikopoulos

This paper examines the long-run behavior of learning with bandit feedback in non-cooperative concave games.

Stochastic Optimization

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

1 code implementation • 19 Jun 2018 • Henry B. Moss, David S. Leslie, Paul Rayson

K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning.

Document Classification • General Classification • +4
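The J-K fold scheme in this title repeats K-fold CV J times with independent random fold assignments and averages all J×K scores, reducing the variance that a single random partition introduces. A minimal sketch, where `fit_score(train, test)` is a placeholder for training a model and scoring it on the held-out fold:

```python
import random
import statistics

def jk_fold_cv(data, fit_score, j=3, k=5, seed=0):
    """Repeated (J-K fold) cross validation: run K-fold CV J times with
    fresh random fold assignments and pool all J*K held-out scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(j):
        idx = list(range(len(data)))
        rng.shuffle(idx)                          # fresh fold assignment
        folds = [idx[f::k] for f in range(k)]     # k near-equal folds
        for fold in folds:
            held = set(fold)
            test = [data[i] for i in held]
            train = [data[i] for i in idx if i not in held]
            scores.append(fit_score(train, test))
    return statistics.mean(scores), statistics.stdev(scores)
```

Reporting the pooled mean (and its spread) over J×K scores makes comparisons between tuned models far less sensitive to one lucky or unlucky partition.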

Combinatorial Multi-Armed Bandits with Filtered Feedback

no code implementations • 26 May 2017 • James A. Grant, David S. Leslie, Kevin Glazebrook, Roberto Szechtman

Motivated by problems in search and detection we present a solution to a Combinatorial Multi-Armed Bandit (CMAB) problem with both heavy-tailed reward distributions and a new class of feedback, filtered semibandit feedback.

Multi-Armed Bandits

Game-theoretical control with continuous action sets

no code implementations • 1 Dec 2014 • Steven Perkins, Panayotis Mertikopoulos, David S. Leslie

To do so, we extend the theory of finite-dimensional two-timescale stochastic approximation to an infinite-dimensional, Banach space setting, and we prove that the continuous dynamics of the process converge to equilibrium in the case of potential games.

Reinforcement Learning
