Search Results for author: David S. Leslie

Found 17 papers, 5 papers with code

FINN.no Slates Dataset: A new Sequential Dataset Logging Interactions, all Viewed Items and Click Responses/No-Click for Recommender Systems Research

1 code implementation • 5 Nov 2021 • Simen Eide, Arnoldo Frigessi, Helge Jenssen, David S. Leslie, Joakim Rishaug, Sofie Verrewaere

Although the usage of exposure data in recommender systems is growing, to our knowledge there is no open large-scale recommender systems dataset that includes the slates of items presented to the users at each interaction.

Recommendation Systems

Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification

no code implementations • 29 Sep 2021 • James A. Grant, David S. Leslie

We consider a variant of online binary classification where a learner sequentially assigns labels ($0$ or $1$) to items with unknown true class.

Binary Classification • Thompson Sampling

Decentralized Q-Learning in Zero-sum Markov Games

no code implementations • NeurIPS 2021 • Muhammed O. Sayin, Kaiqing Zhang, David S. Leslie, Tamer Basar, Asuman Ozdaglar

The key challenge in this decentralized setting is the non-stationarity of the environment from an agent's perspective, since both her own payoffs and the system evolution depend on the actions of other agents, and each agent adapts her policies simultaneously and independently.

Multi-agent Reinforcement Learning • Q-Learning

Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sampling

2 code implementations • 30 Apr 2021 • Simen Eide, David S. Leslie, Arnoldo Frigessi

We introduce a variational Bayesian Recurrent Neural Net recommender system that acts on time series of interactions between the internet platform and the user, and which scales to real world industrial situations.

Recommendation Systems • Thompson Sampling • +1

GIBBON: General-purpose Information-Based Bayesian OptimisatioN

no code implementations • 5 Feb 2021 • Henry B. Moss, David S. Leslie, Javier Gonzalez, Paul Rayson

This paper describes a general-purpose extension of max-value entropy search, a popular approach for Bayesian Optimisation (BO).

Bayesian Optimisation • Point Processes

BOSS: Bayesian Optimization over String Spaces

1 code implementation • NeurIPS 2020 • Henry B. Moss, Daniel Beck, Javier Gonzalez, David S. Leslie, Paul Rayson

This article develops a Bayesian optimization (BO) method which acts directly over raw strings, proposing the first uses of string kernels and genetic algorithms within BO loops.

Bayesian Optimization

Learning to Rank under Multinomial Logit Choice

no code implementations • 7 Sep 2020 • James A. Grant, David S. Leslie

The learning to rank (LTR) framework models this problem as a sequential problem of selecting lists of content and observing where users decide to click.

Learning-To-Rank • Position

BOSH: Bayesian Optimization by Sampling Hierarchically

no code implementations • 2 Jul 2020 • Henry B. Moss, David S. Leslie, Paul Rayson

Deployments of Bayesian Optimization (BO) for functions with stochastic evaluations, such as parameter tuning via cross validation and simulation optimization, typically optimize an average of a fixed set of noisy realizations of the objective function.

Bayesian Optimization • reinforcement-learning • +1

MUMBO: MUlti-task Max-value Bayesian Optimization

no code implementations • 22 Jun 2020 • Henry B. Moss, David S. Leslie, Paul Rayson

MUMBO is scalable and efficient, allowing multi-task Bayesian optimization to be deployed in problems with rich parameter and fidelity spaces.

Bayesian Optimization

On Thompson Sampling for Smoother-than-Lipschitz Bandits

no code implementations • 8 Jan 2020 • James A. Grant, David S. Leslie

Thompson Sampling is a well established approach to bandit and reinforcement learning problems.

reinforcement-learning • Reinforcement Learning (RL) • +1

FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms

1 code implementation • ACL 2019 • Henry B. Moss, Andrew Moore, David S. Leslie, Paul Rayson

We present FIESTA, a model selection approach that significantly reduces the computational resources required to reliably identify state-of-the-art performance from large collections of candidate models.

Model Selection • Sentiment Analysis

Adaptive Sensor Placement for Continuous Spaces

no code implementations • 16 May 2019 • James A. Grant, Alexis Boukouvalas, Ryan-Rhys Griffiths, David S. Leslie, Sattar Vakili, Enrique Munoz de Cote

We consider the problem of adaptively placing sensors along an interval to detect stochastically-generated events.

Thompson Sampling

Adaptive Policies for Perimeter Surveillance Problems

no code implementations • 4 Oct 2018 • James A. Grant, David S. Leslie, Kevin Glazebrook, Roberto Szechtman, Adam N. Letchford

Maximising the detection of intrusions is a fundamental and often critical aim of perimeter surveillance.

Bandit learning in concave $N$-person games

no code implementations • 3 Oct 2018 • Mario Bravo, David S. Leslie, Panayotis Mertikopoulos

This paper examines the long-run behavior of learning with bandit feedback in non-cooperative concave games.

Stochastic Optimization

Using J-K fold Cross Validation to Reduce Variance When Tuning NLP Models

1 code implementation • 19 Jun 2018 • Henry B. Moss, David S. Leslie, Paul Rayson

K-fold cross validation (CV) is a popular method for estimating the true performance of machine learning models, allowing model selection and parameter tuning.

Document Classification • General Classification • +4

Combinatorial Multi-Armed Bandits with Filtered Feedback

no code implementations • 26 May 2017 • James A. Grant, David S. Leslie, Kevin Glazebrook, Roberto Szechtman

Motivated by problems in search and detection we present a solution to a Combinatorial Multi-Armed Bandit (CMAB) problem with both heavy-tailed reward distributions and a new class of feedback, filtered semibandit feedback.

Multi-Armed Bandits

Game-theoretical control with continuous action sets

no code implementations • 1 Dec 2014 • Steven Perkins, Panayotis Mertikopoulos, David S. Leslie

To do so, we extend the theory of finite-dimensional two-timescale stochastic approximation to an infinite-dimensional, Banach space setting, and we prove that the continuous dynamics of the process converge to equilibrium in the case of potential games.
