no code implementations • 17 Apr 2024 • Ameesh Shah, Cameron Voloshin, Chenxi Yang, Abhinav Verma, Swarat Chaudhuri, Sanjit A. Seshia
In our work, we consider the setting in which the task is specified by a linear temporal logic (LTL) objective and an additional scalar reward must be optimized alongside it.
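As a schematic illustration only (the paper's exact formulation may differ), such a setting can be written as maximizing the scalar return among policies that remain near-optimal for the LTL objective φ:

```latex
% Illustrative formulation: maximize the discounted scalar reward among
% policies whose probability of satisfying the LTL objective \varphi over
% trajectories \tau is within \epsilon of the best achievable.
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{s.t.} \quad
\Pr_{\pi}\!\left[\tau \models \varphi\right] \;\ge\; \max_{\pi'} \Pr_{\pi'}\!\left[\tau \models \varphi\right] - \epsilon .
```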
no code implementations • 3 Mar 2023 • Cameron Voloshin, Abhinav Verma, Yisong Yue
Linear temporal logic (LTL) offers a simplified way of specifying tasks for policy optimization that may otherwise be difficult to describe with scalar reward functions.
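For example, a requirement such as "always avoid the unsafe region and visit both checkpoints infinitely often" is awkward to encode as a single scalar reward but is a one-line LTL formula; the atomic propositions below are illustrative:

```latex
% G = "always", F = "eventually", GF = "infinitely often"; unsafe, c_1, c_2
% are illustrative atomic propositions labelling environment states.
\varphi \;=\; \mathbf{G}\,\neg\,\mathrm{unsafe} \;\wedge\; \mathbf{G}\mathbf{F}\,\mathrm{c_1} \;\wedge\; \mathbf{G}\mathbf{F}\,\mathrm{c_2}
```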
no code implementations • 20 Jun 2022 • Cameron Voloshin, Hoang M. Le, Swarat Chaudhuri, Yisong Yue
We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints.
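A standard construction in this line of work, sketched below for illustration (it is not claimed to be this paper's algorithm), is to run the environment in lockstep with an automaton derived from the LTL formula and reward accepting transitions; the formula, labels, and transition table are assumptions made for the example:

```python
# Minimal sketch of tracking an automaton state alongside the MDP state,
# so that progress toward an LTL formula can be rewarded whenever the
# automaton reaches an accepting state.

# Deterministic automaton for "F goal" (eventually reach the goal):
# state 0 = not yet satisfied, state 1 = satisfied (accepting, absorbing).
AUTOMATON = {
    (0, "goal"): 1,
    (0, "other"): 0,
    (1, "goal"): 1,
    (1, "other"): 1,
}
ACCEPTING = {1}

def label(obs):
    """Map an observation to an atomic-proposition label.
    Hypothetical labelling function for a grid world with a goal cell."""
    return "goal" if obs == (3, 3) else "other"

def product_step(env_step, obs, automaton_state, action):
    """Advance the environment and the automaton together, emitting a
    pseudo-reward of 1.0 the first time an accepting state is reached."""
    next_obs, env_reward, done = env_step(obs, action)
    next_q = AUTOMATON[(automaton_state, label(next_obs))]
    ltl_reward = 1.0 if next_q in ACCEPTING and automaton_state not in ACCEPTING else 0.0
    return (next_obs, next_q), env_reward, ltl_reward, done
```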
no code implementations • 2 Mar 2021 • Cameron Voloshin, Nan Jiang, Yisong Yue
We present a novel off-policy loss function for learning a transition model in model-based reinforcement learning.
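The loss proposed in the paper is not reproduced here; purely as a hedged sketch of what an off-policy model-learning objective can look like, the snippet below fits a transition model with a log-likelihood re-weighted by policy probability ratios, so transitions whose actions the evaluation policy favors count more. The `model.log_prob` and policy `.prob` interfaces are assumptions for illustration, not a real library API:

```python
import torch

def weighted_model_nll(model, batch, pi_e, pi_b):
    """Importance-weighted negative log-likelihood for a transition model.

    Generic illustration only: re-weight each logged transition by how much
    more likely the evaluation policy pi_e is to take the logged action than
    the behavior policy pi_b was, then fit the model by maximum likelihood.
    """
    s, a, s_next = batch  # tensors of logged transitions
    with torch.no_grad():
        w = pi_e.prob(a, s) / pi_b.prob(a, s)  # per-step importance weights
    return -(w * model.log_prob(s, a, s_next)).mean()
```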
3 code implementations • 15 Nov 2019 • Cameron Voloshin, Hoang M. Le, Nan Jiang, Yisong Yue
We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety-critical applications.
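One of the classical estimators such a benchmark covers is per-decision importance sampling; a minimal sketch is below, where `pi_e` and `pi_b` are assumed callables returning action probabilities under the evaluation and behavior policies:

```python
import numpy as np

def per_decision_importance_sampling(trajectories, pi_e, pi_b, gamma=0.99):
    """Per-decision importance sampling (PDIS) estimate of the evaluation
    policy's value. Each trajectory is a list of (state, action, reward)
    tuples logged under the behavior policy."""
    estimates = []
    for traj in trajectories:
        rho = 1.0      # cumulative importance weight up to time t
        value = 0.0
        for t, (s, a, r) in enumerate(traj):
            rho *= pi_e(a, s) / pi_b(a, s)
            value += (gamma ** t) * rho * r
        estimates.append(value)
    return float(np.mean(estimates))
```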
2 code implementations • 20 Mar 2019 • Hoang M. Le, Cameron Voloshin, Yisong Yue
When learning policies for real-world domains, two important questions arise: (i) how to efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate among competing objectives and constraints.
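A common way to address (ii) on pre-collected data is a Lagrangian relaxation that alternates between solving an unconstrained batch RL problem and updating the multipliers from off-policy estimates of the constraint values. The sketch below is illustrative only; `best_response` and `evaluate` are assumed subroutines (e.g. fitted Q-iteration and off-policy evaluation on the logged data):

```python
import numpy as np

def constrained_batch_policy_learning(best_response, evaluate, thresholds,
                                      iters=50, lr=0.1):
    """Hedged sketch of Lagrangian-style batch policy learning under
    constraints: alternate between (i) training on reward minus the
    lambda-weighted constraint costs and (ii) dual ascent on lambda using
    estimated constraint values of the current policy."""
    lmbda = np.zeros(len(thresholds))
    policies = []
    for _ in range(iters):
        policy = best_response(lmbda)            # train on r - lambda . g
        costs = np.asarray(evaluate(policy))     # estimated constraint values
        lmbda = np.maximum(0.0, lmbda + lr * (costs - thresholds))  # dual ascent
        policies.append(policy)
    return policies  # e.g. pick or mix among the iterates afterwards
```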