no code implementations • 29 Apr 2023 • Kefan Dong, Tengyu Ma
Our key technical novelty is to prove that the degree-$k$ spherical harmonics components of a function drawn from a Gaussian random field cannot be spiky, in the sense that their $L_\infty$/$L_2$ ratios are upper bounded by $O(d \sqrt{\ln k})$ with high probability.
no code implementations • 26 Jan 2023 • Kefan Dong, Yannis Flet-Berliac, Allen Nie, Emma Brunskill
We present a model-based offline reinforcement learning policy performance lower bound that explicitly captures dynamics model misspecification and distribution mismatch, and we propose an empirical algorithm for optimal offline policy selection.
no code implementations • 21 Nov 2022 • Kefan Dong, Tengyu Ma
The question is very challenging because even two-layer neural networks cannot be guaranteed to extrapolate outside the support of the training distribution without further assumptions on the domain shift.
no code implementations • 6 Jun 2022 • Kefan Dong, Tengyu Ma
Past research on interactive decision making problems (bandits, reinforcement learning, etc.)
no code implementations • NeurIPS 2021 • Andrea Zanette, Kefan Dong, Jonathan Lee, Emma Brunskill
In the stochastic linear contextual bandit setting there exist several minimax procedures for exploration with policies that are reactive to the data being acquired.
no code implementations • NeurIPS 2021 • Kefan Dong, Jiaqi Yang, Tengyu Ma
This paper studies model-based bandit and reinforcement learning (RL) with nonlinear function approximations.
no code implementations • 21 Aug 2020 • Yuanhao Wang, Kefan Dong
We consider the adversarial Markov Decision Process (MDP) problem, where the rewards for the MDP can be adversarially chosen, and the transition function can be either known or unknown.
no code implementations • ICML 2020 • Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou
We also present the ESUCB algorithm with item switching cost $O(N \log^2 T)$.
1 code implementation • ICML 2020 • Kefan Dong, Yuping Luo, Tengyu Ma
We compare model-free reinforcement learning with model-based approaches through the lens of the expressive power of neural networks for policies, $Q$-functions, and dynamics.
no code implementations • 5 Sep 2019 • Kefan Dong, Jian Peng, Yining Wang, Yuan Zhou
Our learning algorithm, Adaptive Value-function Elimination (AVE), is inspired by OLIVE, the policy elimination algorithm proposed by Jiang et al. (2017).
1 code implementation • NeurIPS 2019 • Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng
Goal-oriented reinforcement learning has recently emerged as a practical framework for robotic manipulation tasks, in which an agent is required to reach a certain goal defined by a function on the state space.
no code implementations • ICLR 2020 • Kefan Dong, Yuanhao Wang, Xiaoyu Chen, Li-Wei Wang
A fundamental question in reinforcement learning is whether model-free algorithms are sample efficient.