no code implementations • 18 Mar 2024 • Matthew Zurek, Yudong Chen
Our result is the first that is minimax optimal (up to log factors) in all parameters $S, A, H$ and $\epsilon$, improving on existing work that either assumes uniformly bounded mixing times for all policies or has suboptimal dependence on the parameters.
no code implementations • 22 Nov 2023 • Matthew Zurek, Yudong Chen
Our result is the first that is minimax optimal (up to log factors) in all parameters $S, A, H$ and $\varepsilon$, improving on existing work that either assumes uniformly bounded mixing times for all policies or has suboptimal dependence on the parameters.
no code implementations • 29 Aug 2023 • Matthew Zurek, Yudong Chen
Our gap-free clustering procedure also leads to improved algorithms for recursive clustering.
no code implementations • 2 Jun 2023 • Brahma S. Pavse, Matthew Zurek, Yudong Chen, Qiaomin Xie, Josiah P. Hanna
This latter objective is called stability and is especially important when the state space is unbounded, so that states can be arbitrarily far apart and the agent can drift far from the desired states.
no code implementations • 23 Aug 2022 • Gaurav R. Ghosal, Matthew Zurek, Daniel S. Brown, Anca D. Dragan
In this work, we argue that grounding the rationality coefficient in real data for each feedback type, rather than assuming a default value, has a significant positive effect on reward learning.
no code implementations • 14 Apr 2021 • Matthew Zurek, Andreea Bobu, Daniel S. Brown, Anca D. Dragan
Shared autonomy enables robots to infer user intent and assist in accomplishing it.