no code implementations • 6 Feb 2024 • Li Guo, Keith Ross, Zifan Zhao, George Andriopoulos, Shuyang Ling, Yufeng Xu, Zixuan Dong
We first show empirically that models trained with label smoothing converge faster to neural collapse solutions and attain a stronger level of neural collapse.
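Label smoothing itself is a simple target transformation; a minimal sketch (the mixing form below is the standard one, not code from this paper):

```python
import numpy as np

def smooth_labels(one_hot, alpha=0.1):
    """Label smoothing: blend the one-hot target with the uniform distribution.
    The true class keeps 1 - alpha + alpha/K mass; each other class gets alpha/K."""
    k = one_hot.shape[-1]
    return (1.0 - alpha) * one_hot + alpha / k

# With K = 4 classes and alpha = 0.1: true class 0.925, others 0.025 each.
target = smooth_labels(np.eye(4)[2], alpha=0.1)
```

Training against these softened targets with cross-entropy is what the paper relates to faster convergence to neural collapse.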
no code implementations • 1 Oct 2023 • Zecheng Wang, Che Wang, Zixuan Dong, Keith Ross
Recently, it has been shown that for offline deep reinforcement learning (DRL), pre-training Decision Transformer with a large language corpus can improve downstream performance (Reid et al., 2022).
no code implementations • 7 Sep 2022 • Zixuan Dong, Che Wang, Keith Ross
We nevertheless show that for a large class of MDPs, which includes stochastic MDPs such as blackjack and deterministic MDPs such as Go, the Q-function in MC-UCB converges almost surely to the optimal Q-function.
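The "UCB" part refers to optimistic action selection from visit counts; an illustrative tabular rule (the paper's exact bonus term may differ):

```python
import math

def ucb_action(Q, counts, state, actions, t, c=2.0):
    """Choose the action maximizing an optimistic Q estimate.
    Untried actions score infinity, so every action is tried at least once;
    afterwards the bonus shrinks as the visit count n grows."""
    def score(a):
        n = counts.get((state, a), 0)
        if n == 0:
            return float('inf')
        return Q.get((state, a), 0.0) + c * math.sqrt(math.log(max(t, 2)) / n)
    return max(actions, key=score)
```

An untried action is always preferred, which is the mechanism that lets Monte Carlo estimates cover the whole state-action space without exploring starts.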
1 code implementation • 17 Feb 2022 • Che Wang, Xufang Luo, Keith Ross, Dongsheng Li
We propose VRL3, a powerful data-driven framework with a simple design for solving challenging visual deep reinforcement learning (DRL) tasks.
6 code implementations • ICLR 2021 • Xinyue Chen, Che Wang, Zijian Zhou, Keith Ross
Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved much higher sample efficiency than previous model-free methods for continuous-action DRL benchmarks.
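Structurally, a high UTD ratio just means many gradient updates per collected transition; a minimal skeleton (function names are placeholders, not this paper's API):

```python
def train_loop(collect_transition, gradient_update, env_steps, utd_ratio=20):
    """Off-policy training skeleton with a high Update-To-Data (UTD) ratio:
    utd_ratio gradient updates for every environment transition collected."""
    replay = []                                  # replay buffer of transitions
    n_updates = 0
    for _ in range(env_steps):
        replay.append(collect_transition())      # one new data point ...
        for _ in range(utd_ratio):               # ... then many updates on it
            gradient_update(replay)
            n_updates += 1
    return n_updates
```

Model-based methods tolerate this regime naturally; the paper's contribution is making it work for model-free learners by controlling the resulting estimation bias.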
no code implementations • ICLR 2022 • Che Wang, Shuhan Yuan, Kai Shao, Keith Ross
A simple and natural algorithm for reinforcement learning (RL) is Monte Carlo Exploring Starts (MCES), where the Q-function is estimated by averaging the Monte Carlo returns, and the policy is improved by choosing actions that maximize the current estimate of the Q-function.
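The algorithm as described can be sketched in tabular form; a minimal first-visit MCES on a toy deterministic MDP (interface and constants are illustrative, not from the paper):

```python
import random
from collections import defaultdict

def mces(states, actions, step, is_terminal, episodes=500, gamma=1.0, seed=0):
    """Monte Carlo Exploring Starts: each episode begins from a random
    (state, action) pair, Q averages first-visit Monte Carlo returns, and
    the policy acts greedily with respect to the current Q estimate."""
    rng = random.Random(seed)
    returns = defaultdict(list)                  # (s, a) -> sampled returns
    Q = defaultdict(float)
    policy = {s: actions[0] for s in states}
    for _ in range(episodes):
        s, a = rng.choice(states), rng.choice(actions)   # exploring start
        traj = []
        while not is_terminal(s):
            s_next, r = step(s, a)
            traj.append((s, a, r))
            s = s_next
            if not is_terminal(s):
                a = policy[s]
        # Backward pass; the last overwrite is the first (earliest) visit.
        G, first = 0.0, {}
        for s_t, a_t, r in reversed(traj):
            G = gamma * G + r
            first[(s_t, a_t)] = G
        for sa, g in first.items():
            returns[sa].append(g)
            Q[sa] = sum(returns[sa]) / len(returns[sa])  # average MC return
        for s_t in {sa[0] for sa in first}:
            policy[s_t] = max(actions, key=lambda a_: Q[(s_t, a_)])  # greedy
    return Q, policy

# Toy 2-state deterministic MDP: only the path 0 -a-> 1 -a-> terminal pays 1.
TERMINAL = 2
TRANS = {(0, 'a'): (1, 0.0), (0, 'b'): (TERMINAL, 0.0),
         (1, 'a'): (TERMINAL, 1.0), (1, 'b'): (TERMINAL, 0.0)}
Q, policy = mces(states=[0, 1], actions=['a', 'b'],
                 step=lambda s, a: TRANS[(s, a)],
                 is_terminal=lambda s: s == TERMINAL)
```

On this toy MDP the learned policy takes action 'a' in both states, recovering the optimal path.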
1 code implementation • NeurIPS 2020 • Xinyue Chen, Zijian Zhou, Zheng Wang, Che Wang, Yanqiu Wu, Keith Ross
There has recently been a surge in research in batch Deep Reinforcement Learning (DRL), which aims to learn a high-performing policy from a given dataset without additional interactions with the environment.
3 code implementations • ICML 2020 • Che Wang, Yanqiu Wu, Quan Vuong, Keith Ross
We aim to develop off-policy DRL algorithms that not only exceed state-of-the-art performance but are also simple and minimalistic.
no code implementations • NeurIPS 2020 • Jiachen Li, Quan Vuong, Shuang Liu, Minghua Liu, Kamil Ciosek, Keith Ross, Henrik Iskov Christensen, Hao Su
To perform well, the policy must infer the task identity from collected transitions by modelling its dependency on states, actions and rewards.
no code implementations • 25 Sep 2019 • Che Wang, Yanqiu Wu, Quan Vuong, Keith Ross
The field of Deep Reinforcement Learning (DRL) has recently seen a surge in the popularity of maximum entropy reinforcement learning algorithms.
3 code implementations • 10 Jun 2019 • Che Wang, Keith Ross
The ERE algorithm samples more aggressively from recent experience, and also orders the updates to ensure that updates from old data do not overwrite updates from new data.
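The core of ERE is a shrinking sampling window: the k-th of K updates after an episode draws uniformly from only the most recent c_k transitions. A sketch of the schedule, following the form c_k = max(N * eta^(k*1000/K), c_min); the constants below are illustrative defaults, not necessarily the paper's:

```python
def ere_range(buffer_size, k, num_updates, eta=0.996, c_min=2500):
    """Emphasizing Recent Experience: size of the sampling window for the
    k-th of num_updates updates. Early updates see the whole buffer; later
    updates see only recent data, so recent-data updates come last and are
    not overwritten by updates from old data."""
    c_k = int(buffer_size * eta ** (k * 1000.0 / num_updates))
    return min(buffer_size, max(c_k, c_min))
```

For a buffer of one million transitions and K = 1000, the window starts at the full buffer and shrinks monotonically toward the most recent experience, floored at c_min.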