Search Results for author: Xiuyuan Lu

Found 16 papers, 5 papers with code

RLHF and IIA: Perverse Incentives

no code implementations • 2 Dec 2023 • Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy

Existing algorithms for reinforcement learning from human feedback (RLHF) can incentivize responses at odds with preferences because they are based on models that assume independence of irrelevant alternatives (IIA).

reinforcement-learning

Approximate Thompson Sampling via Epistemic Neural Networks

1 code implementation • 18 Feb 2023 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Further, we demonstrate that the epinet, a small additive network that estimates uncertainty, matches the performance of large ensembles at orders of magnitude lower computational cost.

Thompson Sampling

Robustness of Epinets against Distributional Shifts

no code implementations • 1 Jul 2022 • Xiuyuan Lu, Ian Osband, Seyed Mohammad Asghari, Sven Gowal, Vikranth Dwaracherla, Zheng Wen, Benjamin Van Roy

However, these improvements are relatively small compared to the outstanding issues in distributionally-robust deep learning.

Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping

no code implementations • 8 Jun 2022 • Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, Seyed Mohammad Asghari, Benjamin Van Roy

In machine learning, an agent needs to estimate uncertainty to efficiently explore and adapt and to make effective decisions.

An Analysis of Ensemble Sampling

no code implementations • 2 Mar 2022 • Chao Qin, Zheng Wen, Xiuyuan Lu, Benjamin Van Roy

Ensemble sampling serves as a practical approximation to Thompson sampling when maintaining an exact posterior distribution over model parameters is computationally intractable.

Thompson Sampling

Evaluating High-Order Predictive Distributions in Deep Learning

1 code implementation • 28 Feb 2022 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Benjamin Van Roy

Previous work has developed methods for assessing low-order predictive distributions with inputs sampled i.i.d.

regression • Vocal Bursts Intensity Prediction

Event-based Motion Segmentation by Cascaded Two-Level Multi-Model Fitting

no code implementations • 5 Nov 2021 • Xiuyuan Lu, Yi Zhou, Shaojie Shen

In this paper, we present a cascaded two-level multi-model fitting method for identifying independently moving objects (i.e., the motion segmentation problem) with a monocular event camera.

Clustering • Motion Segmentation +1

The Neural Testbed: Evaluating Joint Predictions

1 code implementation • 9 Oct 2021 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Botao Hao, Morteza Ibrahimi, Dieterich Lawson, Xiuyuan Lu, Brendan O'Donoghue, Benjamin Van Roy

Predictive distributions quantify uncertainties ignored by point estimates.

Evaluating Predictive Distributions: Does Bayesian Deep Learning Work?

no code implementations • 29 Sep 2021 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Dieterich Lawson, Brendan O'Donoghue, Botao Hao, Benjamin Van Roy

This paper introduces The Neural Testbed, which provides tools for the systematic evaluation of agents that generate such predictions.

Uncertainty Quantification

From Predictions to Decisions: The Importance of Joint Predictive Distributions

no code implementations • 20 Jul 2021 • Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy

A fundamental challenge for any intelligent system is prediction: given some inputs, can you predict corresponding outcomes?

Multi-Armed Bandits • Thompson Sampling

Epistemic Neural Networks

1 code implementation • NeurIPS 2023 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

We introduce the epinet: an architecture that can supplement any conventional neural network, including large pretrained models, and can be trained with modest incremental computation to estimate uncertainty.

Reinforcement Learning, Bit by Bit

no code implementations • 6 Mar 2021 • Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen

To illustrate concepts, we design simple agents that build on them and present computational results that highlight data efficiency.

reinforcement-learning • Reinforcement Learning (RL)

Event-based Motion Segmentation with Spatio-Temporal Graph Cuts

1 code implementation • 16 Dec 2020 • Yi Zhou, Guillermo Gallego, Xiuyuan Lu, SiQi Liu, Shaojie Shen

We develop a method to identify independently moving objects acquired with an event-based camera, i.e., to solve the event-based motion segmentation problem.

Motion Segmentation • Scene Understanding

Hypermodels for Exploration

no code implementations • ICLR 2020 • Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy

This generalizes and extends the use of ensembles to approximate Thompson sampling.

Thompson Sampling

Information-Theoretic Confidence Bounds for Reinforcement Learning

no code implementations • NeurIPS 2019 • Xiuyuan Lu, Benjamin Van Roy

We integrate information-theoretic concepts into the design and analysis of optimistic algorithms and Thompson sampling.

reinforcement-learning • Reinforcement Learning (RL) +1

Ensemble Sampling

no code implementations • NeurIPS 2017 • Xiuyuan Lu, Benjamin Van Roy

Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems.

Thompson Sampling
