Search Results for author: Sapana Chaudhary

Found 5 papers, 2 papers with code

Pedagogical Alignment of Large Language Models

1 code implementation • 7 Feb 2024 • Shashank Sonkar, Kangqi Ni, Sapana Chaudhary, Richard G. Baraniuk

Building on this perspective, we propose a novel approach for constructing a reward dataset specifically designed for the pedagogical alignment of LLMs.

Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments

1 code implementation • 26 Sep 2022 • Desik Rengarajan, Sapana Chaudhary, Jaewon Kim, Dileep Kalathil, Srinivas Shakkottai

Meta reinforcement learning (Meta-RL) is an approach wherein the experience gained from solving a variety of tasks is distilled into a meta-policy.

Meta Reinforcement Learning, Reinforcement Learning

Smooth Imitation Learning via Smooth Costs and Smooth Policies

no code implementations • 3 Nov 2021 • Sapana Chaudhary, Balaraman Ravindran

We call our new smooth IL algorithm Smooth Policy and Cost Imitation Learning (SPaCIL, pronounced 'Special').

Continuous Control, Imitation Learning

Safe Online Convex Optimization with Unknown Linear Safety Constraints

no code implementations • 14 Nov 2021 • Sapana Chaudhary, Dileep Kalathil

We study the problem of safe online convex optimization, where the action at each time step must satisfy a set of linear safety constraints.

Dynamic Regret Analysis of Safe Distributed Online Optimization for Convex and Non-convex Problems

no code implementations • 23 Feb 2023 • Ting-Jui Chang, Sapana Chaudhary, Dileep Kalathil, Shahin Shahrampour

We prove that for convex functions, D-Safe-OGD achieves a dynamic regret bound of $O(T^{2/3} \sqrt{\log T} + T^{1/3}C_T^*)$, where $C_T^*$ denotes the path-length of the best minimizer sequence.
