no code implementations • 23 Feb 2023 • Ting-Jui Chang, Sapana Chaudhary, Dileep Kalathil, Shahin Shahrampour
We prove that for convex functions, D-Safe-OGD achieves a dynamic regret bound of $O(T^{2/3} \sqrt{\log T} + T^{1/3}C_T^*)$, where $C_T^*$ denotes the path-length of the best minimizer sequence.
1 code implementation • 26 Sep 2022 • Desik Rengarajan, Sapana Chaudhary, Jaewon Kim, Dileep Kalathil, Srinivas Shakkottai
Meta reinforcement learning (Meta-RL) is an approach wherein the experience gained from solving a variety of tasks is distilled into a meta-policy.
no code implementations • 14 Nov 2021 • Sapana Chaudhary, Dileep Kalathil
We study the problem of safe online convex optimization, where the action at each time step must satisfy a set of linear safety constraints.
no code implementations • 3 Nov 2021 • Sapana Chaudhary, Balaraman Ravindran
We call our new smooth IL algorithm Smooth Policy and Cost Imitation Learning (SPaCIL, pronounced 'Special').