no code implementations • 6 Feb 2024 • Li Guo, Keith Ross, Zifan Zhao, George Andriopoulos, Shuyang Ling, Yufeng Xu, Zixuan Dong
We first show empirically that models trained with label smoothing converge faster to neural collapse solutions and attain a stronger level of neural collapse.
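Label smoothing itself is a simple target transformation; a minimal sketch (the mixing form below is the standard one, not code from this paper):

```python
import numpy as np

def smooth_labels(one_hot, alpha=0.1):
    """Label smoothing: blend the one-hot target with the uniform distribution.
    The true class keeps 1 - alpha + alpha/K mass; each other class gets alpha/K."""
    k = one_hot.shape[-1]
    return (1.0 - alpha) * one_hot + alpha / k

# With K = 4 classes and alpha = 0.1: true class 0.925, others 0.025 each.
target = smooth_labels(np.eye(4)[2], alpha=0.1)
```

Training against these softened targets with cross-entropy is what the paper relates to faster convergence to neural collapse.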
no code implementations • 1 Oct 2023 • Zecheng Wang, Che Wang, Zixuan Dong, Keith Ross
Recently, it has been shown that for offline deep reinforcement learning (DRL), pre-training Decision Transformer with a large language corpus can improve downstream performance (Reid et al., 2022).
no code implementations • 7 Sep 2022 • Zixuan Dong, Che Wang, Keith Ross
We nevertheless show that for a large class of MDPs, which includes stochastic MDPs such as blackjack and deterministic MDPs such as Go, the Q-function in MC-UCB converges almost surely to the optimal Q-function.
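The "UCB" part refers to optimistic action selection from visit counts; an illustrative tabular rule (the paper's exact bonus term may differ):

```python
import math

def ucb_action(Q, counts, state, actions, t, c=2.0):
    """Choose the action maximizing an optimistic Q estimate.
    Untried actions score infinity, so every action is tried at least once;
    afterwards the bonus shrinks as the visit count n grows."""
    def score(a):
        n = counts.get((state, a), 0)
        if n == 0:
            return float('inf')
        return Q.get((state, a), 0.0) + c * math.sqrt(math.log(max(t, 2)) / n)
    return max(actions, key=score)
```

An untried action is always preferred, which is the mechanism that lets Monte Carlo estimates cover the whole state-action space without exploring starts.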
1 code implementation • 17 Feb 2022 • Che Wang, Xufang Luo, Keith Ross, Dongsheng Li
We propose VRL3, a powerful data-driven framework with a simple design for solving challenging visual deep reinforcement learning (DRL) tasks.
6 code implementations • ICLR 2021 • Xinyue Chen, Che Wang, Zijian Zhou, Keith Ross
Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved much higher sample efficiency than previous model-free methods for continuous-action DRL benchmarks.
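Structurally, a high UTD ratio just means many gradient updates per collected transition; a minimal skeleton (function names are placeholders, not this paper's API):

```python
def train_loop(collect_transition, gradient_update, env_steps, utd_ratio=20):
    """Off-policy training skeleton with a high Update-To-Data (UTD) ratio:
    utd_ratio gradient updates for every environment transition collected."""
    replay = []                                  # replay buffer of transitions
    n_updates = 0
    for _ in range(env_steps):
        replay.append(collect_transition())      # one new data point ...
        for _ in range(utd_ratio):               # ... then many updates on it
            gradient_update(replay)
            n_updates += 1
    return n_updates
```

Model-based methods tolerate this regime naturally; the paper's contribution is making it work for model-free learners by controlling the resulting estimation bias.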
no code implementations • ICLR 2022 • Che Wang, Shuhan Yuan, Kai Shao, Keith Ross
A simple and natural algorithm for reinforcement learning (RL) is Monte Carlo Exploring Starts (MCES), where the Q-function is estimated by averaging the Monte Carlo returns, and the policy is improved by choosing actions that maximize the current estimate of the Q-function.
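The algorithm as described can be sketched in tabular form; a minimal first-visit MCES on a toy deterministic MDP (interface and constants are illustrative, not from the paper):

```python
import random
from collections import defaultdict

def mces(states, actions, step, is_terminal, episodes=500, gamma=1.0, seed=0):
    """Monte Carlo Exploring Starts: each episode begins from a random
    (state, action) pair, Q averages first-visit Monte Carlo returns, and
    the policy acts greedily with respect to the current Q estimate."""
    rng = random.Random(seed)
    returns = defaultdict(list)                  # (s, a) -> sampled returns
    Q = defaultdict(float)
    policy = {s: actions[0] for s in states}
    for _ in range(episodes):
        s, a = rng.choice(states), rng.choice(actions)   # exploring start
        traj = []
        while not is_terminal(s):
            s_next, r = step(s, a)
            traj.append((s, a, r))
            s = s_next
            if not is_terminal(s):
                a = policy[s]
        # Backward pass; the last overwrite is the first (earliest) visit.
        G, first = 0.0, {}
        for s_t, a_t, r in reversed(traj):
            G = gamma * G + r
            first[(s_t, a_t)] = G
        for sa, g in first.items():
            returns[sa].append(g)
            Q[sa] = sum(returns[sa]) / len(returns[sa])  # average MC return
        for s_t in {sa[0] for sa in first}:
            policy[s_t] = max(actions, key=lambda a_: Q[(s_t, a_)])  # greedy
    return Q, policy

# Toy 2-state deterministic MDP: only the path 0 -a-> 1 -a-> terminal pays 1.
TERMINAL = 2
TRANS = {(0, 'a'): (1, 0.0), (0, 'b'): (TERMINAL, 0.0),
         (1, 'a'): (TERMINAL, 1.0), (1, 'b'): (TERMINAL, 0.0)}
Q, policy = mces(states=[0, 1], actions=['a', 'b'],
                 step=lambda s, a: TRANS[(s, a)],
                 is_terminal=lambda s: s == TERMINAL)
```

On this toy MDP the learned policy takes action 'a' in both states, recovering the optimal path.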
1 code implementation • NeurIPS 2020 • Xinyue Chen, Zijian Zhou, Zheng Wang, Che Wang, Yanqiu Wu, Keith Ross
There has recently been a surge in research in batch Deep Reinforcement Learning (DRL), which aims to learn a high-performing policy from a given dataset without additional interactions with the environment.
3 code implementations • ICML 2020 • Che Wang, Yanqiu Wu, Quan Vuong, Keith Ross
We aim to develop off-policy DRL algorithms that not only exceed state-of-the-art performance but are also simple and minimalistic.
no code implementations • NeurIPS 2020 • Jiachen Li, Quan Vuong, Shuang Liu, Minghua Liu, Kamil Ciosek, Keith Ross, Henrik Iskov Christensen, Hao Su
To perform well, the policy must infer the task identity from collected transitions by modelling its dependency on states, actions and rewards.
no code implementations • 25 Sep 2019 • Che Wang, Yanqiu Wu, Quan Vuong, Keith Ross
The field of Deep Reinforcement Learning (DRL) has recently seen a surge in the popularity of maximum entropy reinforcement learning algorithms.
3 code implementations • 10 Jun 2019 • Che Wang, Keith Ross
The ERE algorithm samples more aggressively from recent experience, and also orders the updates to ensure that updates from old data do not overwrite updates from new data.
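The core of ERE is a shrinking sampling window: the k-th of K updates after an episode draws uniformly from only the most recent c_k transitions. A sketch of the schedule, following the form c_k = max(N * eta^(k*1000/K), c_min); the constants below are illustrative defaults, not necessarily the paper's:

```python
def ere_range(buffer_size, k, num_updates, eta=0.996, c_min=2500):
    """Emphasizing Recent Experience: size of the sampling window for the
    k-th of num_updates updates. Early updates see the whole buffer; later
    updates see only recent data, so recent-data updates come last and are
    not overwritten by updates from old data."""
    c_k = int(buffer_size * eta ** (k * 1000.0 / num_updates))
    return min(buffer_size, max(c_k, c_min))
```

For a buffer of one million transitions and K = 1000, the window starts at the full buffer and shrinks monotonically toward the most recent experience, floored at c_min.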