no code implementations • 17 Apr 2024 • Ameesh Shah, Cameron Voloshin, Chenxi Yang, Abhinav Verma, Swarat Chaudhuri, Sanjit A. Seshia
In our work, we consider the setting where the task is specified by an LTL objective and there is an additional scalar reward that we need to optimize.
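The combination described here — an LTL objective tracked alongside a scalar reward — can be sketched with a product-style construction (a toy illustration with a hand-coded DFA for "eventually reach goal", not the paper's algorithm; the class name and bonus scheme are hypothetical):

```python
# Toy sketch: track an LTL objective via its automaton while also
# accumulating a scalar environment reward. The DFA below hand-encodes
# "F goal" (eventually goal) purely for illustration.

class ProductTracker:
    """Pairs environment observations with a DFA state so a policy can
    optimize scalar reward while progressing toward the LTL objective."""

    def __init__(self, dfa_transitions, accepting, init_state=0):
        self.delta = dfa_transitions      # {(q, label): q'}
        self.accepting = accepting        # set of accepting DFA states
        self.q = init_state

    def step(self, label, env_reward, bonus=1.0):
        prev_q = self.q
        self.q = self.delta.get((self.q, label), self.q)
        # Shaped reward: environment reward plus a one-time bonus when the
        # automaton first enters an accepting state.
        newly_accepted = self.q in self.accepting and prev_q not in self.accepting
        return self.q, env_reward + (bonus if newly_accepted else 0.0)

# DFA for "F goal": state 0 = not yet satisfied, state 1 = satisfied (absorbing).
delta = {(0, "goal"): 1, (0, "other"): 0, (1, "goal"): 1, (1, "other"): 1}
tracker = ProductTracker(delta, accepting={1})
```

In practice the automaton would be compiled from the LTL formula by a translator rather than written by hand.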
1 code implementation • NeurIPS 2023 • Đorđe Žikelić, Mathias Lechner, Abhinav Verma, Krishnendu Chatterjee, Thomas A. Henzinger
Compared to previous work, we also derive a tighter lower bound on the probability of reach-avoidance implied by a RASM; such a bound is required to find a compositional policy that meets an acceptable probabilistic threshold on complex tasks with multiple edge policies.
no code implementations • 3 Mar 2023 • Cameron Voloshin, Abhinav Verma, Yisong Yue
Linear temporal logic (LTL) offers a simplified way of specifying tasks for policy optimization that may otherwise be difficult to describe with scalar reward functions.
1 code implementation • NeurIPS 2020 • Greg Anderson, Abhinav Verma, Isil Dillig, Swarat Chaudhuri
We present Revel, a partially neural reinforcement learning (RL) framework for provably safe exploration in continuous state and action spaces.
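The safe-exploration idea can be illustrated with a simple shielding scheme (an illustration of the general concept, not Revel's actual algorithm; the function names and the toy 1-D system are hypothetical):

```python
# Simplified shielding sketch: a learned policy proposes an action, and a
# verified fallback controller overrides it whenever the resulting next
# state would leave a known safe region.

def shielded_action(state, learned_policy, safe_fallback, dynamics, is_safe):
    proposed = learned_policy(state)
    if is_safe(dynamics(state, proposed)):
        return proposed            # learned action is safe to execute
    return safe_fallback(state)    # otherwise defer to the verified fallback

# Toy 1-D system: stay within |x| <= 1.
dynamics = lambda x, a: x + a
is_safe = lambda x: abs(x) <= 1.0
learned_policy = lambda x: 0.9         # aggressive learned action
safe_fallback = lambda x: -0.1 * x     # conservative contraction toward 0
```

With this split, exploration happens in the learned policy while safety rests entirely on the verified fallback and the safety check.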
1 code implementation • NeurIPS 2020 • Ameesh Shah, Eric Zhan, Jennifer J. Sun, Abhinav Verma, Yisong Yue, Swarat Chaudhuri
This relaxed program is differentiable and can be trained end-to-end, and the resulting training loss is an approximately admissible heuristic that can guide the combinatorial search.
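The role of a relaxation as an admissible heuristic can be sketched in miniature (a toy stand-in, not the paper's system: the "program" is a ±1 sign pattern, and the relaxation lets unassigned slots take any value in [-1, 1]):

```python
# Best-first search over discrete choices where the heuristic for a partial
# assignment is the cost of its best *relaxed* completion. Because the
# relaxation can only do at least as well as any discrete completion, the
# heuristic never overestimates, so the search is admissible (A*-style).
import heapq

def search(targets):
    """Find the +/-1 assignment minimizing squared error to `targets`."""

    def lower_bound(prefix):
        fixed = sum((s - t) ** 2 for s, t in zip(prefix, targets))
        # Relaxed completion: remaining slots may take any value in [-1, 1],
        # so their best-case cost is 0 unless the target lies outside that box.
        rest = sum(max(0.0, abs(t) - 1.0) ** 2 for t in targets[len(prefix):])
        return fixed + rest

    frontier = [(lower_bound(()), ())]
    while frontier:
        bound, prefix = heapq.heappop(frontier)
        if len(prefix) == len(targets):
            return list(prefix), bound   # first complete pop is optimal
        for s in (-1, 1):
            cand = prefix + (s,)
            heapq.heappush(frontier, (lower_bound(cand), cand))
```

A tighter relaxation (here, a smaller box) prunes more of the search, which mirrors why a well-trained differentiable relaxation makes a useful guide.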
no code implementations • NeurIPS 2019 • Abhinav Verma, Hoang M. Le, Yisong Yue, Swarat Chaudhuri
First, we view our learning task as optimization in policy space, modulo the constraint that the desired policy has a programmatic representation. We solve this optimization problem using a form of mirror descent that takes a gradient step into the unconstrained policy space and then projects back onto the constrained space.
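The structure of that iteration — an unconstrained gradient step followed by a projection — can be made concrete on a toy problem (illustrative only: the paper projects onto programmatic policies via imitation, whereas here the "programmatic" set is just a linear subspace with equal components):

```python
# Update-then-project sketch: gradient step in the full parameter space,
# then projection back onto a constrained set.
import numpy as np

def update_then_project(theta, grad, lr, project):
    unconstrained = theta - lr * grad(theta)   # step in the full policy space
    return project(unconstrained)              # back onto the constrained set

# Constrained set: parameter pairs with equal components, {(a, a)}.
def project_equal(theta):
    return np.full(2, theta.mean())

# Objective ||theta - (3, 1)||^2; its constrained minimizer is (2, 2).
grad = lambda th: 2 * (th - np.array([3.0, 1.0]))

theta = np.zeros(2)
for _ in range(100):
    theta = update_then_project(theta, grad, lr=0.1, project=project_equal)
```

The iterate stays on the constrained set after every step while still using gradients computed in the unconstrained space.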
1 code implementation • 14 May 2019 • Richard Cheng, Abhinav Verma, Gabor Orosz, Swarat Chaudhuri, Yisong Yue, Joel W. Burdick
We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off.
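The bias-variance trade-off from functional regularization can be seen in a minimal supervised analogue (a hypothetical sketch, not the paper's algorithm: a ridge-style fit penalized toward a hand-designed prior, with `lam` playing the role of the regularization weight):

```python
# Functional-regularization analogue: penalize deviation of the learned
# weights from a prior controller's weights. Large lam biases the solution
# toward the prior (lower variance, higher bias); lam = 0 recovers the
# unregularized least-squares fit.
import numpy as np

def fit_toward_prior(X, y, prior_w, lam):
    """Closed-form solution of min_w ||Xw - y||^2 + lam * ||w - prior_w||^2."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * prior_w
    return np.linalg.solve(A, b)
```

An adaptive strategy in this picture would adjust `lam` based on how much the data is trusted relative to the prior.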
no code implementations • ICLR 2019 • Joshua J. Michalenko, Ameesh Shah, Abhinav Verma, Swarat Chaudhuri, Ankit B. Patel
We study the internal representations that a recurrent neural network (RNN) uses while learning to recognize a regular formal language.
no code implementations • 27 Feb 2019 • Joshua J. Michalenko, Ameesh Shah, Abhinav Verma, Richard G. Baraniuk, Swarat Chaudhuri, Ankit B. Patel
We investigate the internal representations that a recurrent neural network (RNN) uses while learning to recognize a regular formal language.
no code implementations • ICML 2018 • Abhinav Verma, Vijayaraghavan Murali, Rishabh Singh, Pushmeet Kohli, Swarat Chaudhuri
Unlike the popular Deep Reinforcement Learning (DRL) paradigm, which represents policies by neural networks, PIRL represents policies using a high-level, domain-specific programming language.