Search Results for author: Bhavya Sukhija

Found 13 papers, 6 papers with code

ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

no code implementations · 12 Oct 2024 · Yarden As, Bhavya Sukhija, Lenart Treven, Carmelo Sferrazza, Stelian Coros, Andreas Krause

Under regularity assumptions on the constraints and dynamics, we show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time.

Efficient Exploration · Reinforcement Learning +3

Transductive Active Learning with Application to Safe Bayesian Optimization

1 code implementation · ICML Workshop on Aligning Reinforcement Learning Experimentalists and Theorists 2024 · Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause

We analyze safe Bayesian optimization (Safe BO) through the lens of a generalization of active learning with concrete prediction targets, where sampling is restricted to an accessible region of the domain while prediction targets may lie outside this region.

Active Learning · Bayesian Optimization +4

NeoRL: Efficient Exploration for Nonepisodic RL

no code implementations · 3 Jun 2024 · Bhavya Sukhija, Lenart Treven, Florian Dörfler, Stelian Coros, Andreas Krause

We study the problem of nonepisodic reinforcement learning (RL) for nonlinear dynamical systems, where the system dynamics are unknown and the RL agent has to learn from a single trajectory, i.e., without resets.

Efficient Exploration · Reinforcement Learning (RL)

Safe Exploration Using Bayesian World Models and Log-Barrier Optimization

no code implementations · 9 May 2024 · Yarden As, Bhavya Sukhija, Andreas Krause

A major challenge in deploying reinforcement learning in online tasks is ensuring that safety is maintained throughout the learning process.

Safe Exploration
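The log-barrier idea named in the title can be illustrated in isolation. The sketch below is a generic interior-point-style example, not the paper's algorithm: it assumes a hypothetical scalar cost `f` and constraint `g`, and descends the barrier-augmented objective f(x) - eta * log(-g(x)), which keeps every iterate strictly inside the safe set g(x) <= 0.

```python
import numpy as np

def f(x):          # hypothetical cost, minimized (unconstrained) at x = 2
    return (x - 2.0) ** 2

def g(x):          # hypothetical constraint: safe iff x <= 1
    return x - 1.0

def barrier_descent(x0=0.0, eta=0.1, lr=0.01, steps=2000):
    """Gradient descent on f(x) - eta * log(-g(x))."""
    x = x0
    for _ in range(steps):
        # d/dx [f(x) - eta * log(-g(x))] = 2(x - 2) + eta / (1 - x)
        grad = 2.0 * (x - 2.0) + eta / (-g(x))
        x -= lr * grad
        assert g(x) < 0  # the barrier keeps iterates strictly feasible
    return x

x_star = barrier_descent()
```

The barrier gradient blows up as x approaches the constraint boundary at 1, so the iterates settle just inside it (near x ≈ 0.95 for eta = 0.1) instead of at the unconstrained optimum x = 2.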

Active Few-Shot Fine-Tuning

no code implementations · 13 Feb 2024 · Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause

We study the question: How can we select the right data for fine-tuning to a specific task?

Active Learning · Generalization Bounds +1

Transductive Active Learning: Theory and Applications

2 code implementations · 13 Feb 2024 · Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause

We study a generalization of classical active learning to real-world settings with concrete prediction targets, where sampling is restricted to an accessible region of the domain while prediction targets may lie outside this region.

Active Learning · Bayesian Optimization +2
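The transductive setting described above can be sketched with a toy Gaussian-process model. This is a generic illustration of the idea, not the paper's algorithm: the accessible interval, target point, kernel, and greedy variance-reduction rule are all assumptions for the example.

```python
import numpy as np

def rbf(a, b, ls=0.5):
    """Squared-exponential kernel between 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def posterior_var_at_target(observed, target, noise=1e-3):
    """GP predictive variance at `target` after observing `observed`."""
    K = rbf(observed, observed) + noise * np.eye(len(observed))
    k = rbf(observed, target)
    return (rbf(target, target) - k.T @ np.linalg.solve(K, k)).item()

accessible = np.linspace(0.0, 1.0, 21)   # where sampling is allowed
target = np.array([1.3])                 # prediction target outside the region

# Greedy transductive choice: the accessible sample that most reduces
# predictive uncertainty at the (inaccessible) target.
scores = [posterior_var_at_target(np.array([x]), target) for x in accessible]
best = accessible[int(np.argmin(scores))]
```

With a stationary kernel the chosen sample is the accessible point closest to the target (here x = 1.0), which is the intuition behind sampling for targets outside the accessible region.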

Data-Efficient Task Generalization via Probabilistic Model-based Meta Reinforcement Learning

no code implementations · 13 Nov 2023 · Arjun Bhardwaj, Jonas Rothfuss, Bhavya Sukhija, Yarden As, Marco Hutter, Stelian Coros, Andreas Krause

We introduce PACOH-RL, a novel model-based Meta-Reinforcement Learning (Meta-RL) algorithm designed to efficiently adapt control policies to changing dynamics.

Meta-Learning · Meta Reinforcement Learning +2

Tuning Legged Locomotion Controllers via Safe Bayesian Optimization

1 code implementation · 12 Jun 2023 · Daniel Widmer, Dongho Kang, Bhavya Sukhija, Jonas Hübotter, Andreas Krause, Stelian Coros

This paper presents a data-driven strategy to streamline the deployment of model-based controllers in legged robotic hardware platforms.

Bayesian Optimization · Efficient Exploration

Hallucinated Adversarial Control for Conservative Offline Policy Evaluation

1 code implementation · 2 Mar 2023 · Jonas Rothfuss, Bhavya Sukhija, Tobias Birchler, Parnian Kassraie, Andreas Krause

We study the problem of conservative off-policy evaluation (COPE): given an offline dataset of environment interactions collected by other agents, we seek to obtain a (tight) lower bound on a policy's performance.

Continuous Control +2

Gradient-Based Trajectory Optimization With Learned Dynamics

no code implementations · 9 Apr 2022 · Bhavya Sukhija, Nathanael Köhler, Miguel Zamora, Simon Zimmermann, Sebastian Curi, Andreas Krause, Stelian Coros

In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot robot and a radio-controlled (RC) car, and achieves good performance in combination with trajectory optimization methods.

GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems

1 code implementation · 24 Jan 2022 · Bhavya Sukhija, Matteo Turchetta, David Lindner, Andreas Krause, Sebastian Trimpe, Dominik Baumann

Learning optimal control policies directly on physical systems is challenging since even a single failure can lead to costly hardware damage.

Safe Exploration
