Search Results for author: Seyed Mohammad Asghari

Found 11 papers, 5 papers with code

Efficient Exploration for LLMs

no code implementations • 1 Feb 2024 • Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao, Benjamin Van Roy

We present evidence of substantial benefit from efficient exploration in gathering human feedback to improve large language models.

Efficient Exploration • Thompson Sampling

Approximate Thompson Sampling via Epistemic Neural Networks

1 code implementation • 18 Feb 2023 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Further, we demonstrate that the epinet, a small additive network that estimates uncertainty, matches the performance of large ensembles at orders of magnitude lower computational cost.

Thompson Sampling

Fine-Tuning Language Models via Epistemic Neural Networks

1 code implementation • 3 Nov 2022 • Ian Osband, Seyed Mohammad Asghari, Benjamin Van Roy, Nat McAleese, John Aslanides, Geoffrey Irving

Language models often pre-train on large unsupervised text corpora, then fine-tune on additional task-specific data.

Active Learning • Language Modelling

Robustness of Epinets against Distributional Shifts

no code implementations • 1 Jul 2022 • Xiuyuan Lu, Ian Osband, Seyed Mohammad Asghari, Sven Gowal, Vikranth Dwaracherla, Zheng Wen, Benjamin Van Roy

However, these improvements are relatively small compared to the outstanding issues in distributionally-robust deep learning.

Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping

no code implementations • 8 Jun 2022 • Vikranth Dwaracherla, Zheng Wen, Ian Osband, Xiuyuan Lu, Seyed Mohammad Asghari, Benjamin Van Roy

In machine learning, an agent needs to estimate uncertainty to efficiently explore and adapt and to make effective decisions.

Evaluating High-Order Predictive Distributions in Deep Learning

1 code implementation • 28 Feb 2022 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Xiuyuan Lu, Benjamin Van Roy

Previous work has developed methods for assessing low-order predictive distributions with inputs sampled i.i.d.

Regression • Vocal Bursts Intensity Prediction

The Neural Testbed: Evaluating Joint Predictions

1 code implementation • 9 Oct 2021 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Botao Hao, Morteza Ibrahimi, Dieterich Lawson, Xiuyuan Lu, Brendan O'Donoghue, Benjamin Van Roy

Predictive distributions quantify uncertainties ignored by point estimates.

Evaluating Predictive Distributions: Does Bayesian Deep Learning Work?

no code implementations • 29 Sep 2021 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Dieterich Lawson, Brendan O'Donoghue, Botao Hao, Benjamin Van Roy

This paper introduces The Neural Testbed, which provides tools for the systematic evaluation of agents that generate such predictions.

Uncertainty Quantification

Epistemic Neural Networks

1 code implementation • NeurIPS 2023 • Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

We introduce the epinet: an architecture that can supplement any conventional neural network, including large pretrained models, and can be trained with modest incremental computation to estimate uncertainty.

Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems

no code implementations • 27 Jan 2020 • Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar

This allows the agents to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem.

Multi-agent Reinforcement Learning

Learning to Code: Coded Caching via Deep Reinforcement Learning

no code implementations • 9 Dec 2019 • Navid Naderializadeh, Seyed Mohammad Asghari

We consider a system comprising a file library and a network with a server and multiple users equipped with cache memories.

Reinforcement Learning (RL)
