no code implementations • EACL (HumEval) 2021 • Shaily Bhatt, Rahul Jain, Sandipan Dandapat, Sunayana Sitaram
We conduct experiments for evaluating an offensive content detection system and use a data augmentation technique for improving the model using insights from Checklist.
no code implementations • 17 Oct 2023 • Dengwang Tang, Rahul Jain, Botao Hao, Zheng Wen
In this paper, we study the problem of efficient online reinforcement learning in the infinite horizon setting when there is an offline dataset to start with.
no code implementations • 16 Oct 2023 • Dengwang Tang, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and more implementable compared to state-of-the-art optimism-based online learning algorithms for POMDPs.
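The posterior-sampling idea underlying PS4POMDPs is easiest to see in the fully observed tabular case. Below is a minimal sketch of episodic posterior sampling RL (PSRL) with Dirichlet posteriors over transition kernels — a simplified illustration of the general principle, not the paper's POMDP algorithm; the function names and environment interface are assumptions:

```python
import numpy as np

def solve_mdp(P, R, gamma=0.9, iters=200):
    """Value iteration on a sampled tabular MDP; returns a greedy policy."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * P @ V          # P has shape (S, A, S)
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

def psrl_episode(counts, R, env_step, s0, horizon, rng):
    """One PSRL episode: sample an MDP from the Dirichlet posterior,
    solve it, act greedily, and update the transition counts."""
    S, A, _ = counts.shape
    # Posterior sample: each (s, a) row of P is Dirichlet(counts[s, a] + 1).
    P = np.stack([[rng.dirichlet(counts[s, a] + 1) for a in range(A)]
                  for s in range(S)])
    pi = solve_mdp(P, R)
    s = s0
    for _ in range(horizon):
        a = pi[s]
        s_next = env_step(s, a)
        counts[s, a, s_next] += 1
        s = s_next
    return counts
```

The appeal of posterior sampling over optimism-based methods, as the abstract notes, is implementability: each episode requires only one posterior sample and one planning call.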
no code implementations • 15 Oct 2023 • Rahul Jain, Anoushka Saha, Gourav Daga, Durba Bhattacharya, Madhura Das Gupta, Sourav Chowdhury, Suparna Roychowdhury
Type 2 diabetes mellitus represents a prevalent and widespread global health concern, necessitating a comprehensive assessment of its risk factors.
no code implementations • 24 Aug 2023 • Rishabh Agrawal, Nathan Dahlin, Rahul Jain, Ashutosh Nayyar
Classical methods such as behavioral cloning and inverse reinforcement learning are highly sensitive to estimation errors, a problem that is particularly acute in continuous state space problems.
no code implementations • 24 May 2023 • Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
In this paper, we first introduce an optimal control theory for partially observable Markov decision processes (POMDPs) with finite linear temporal logic constraints.
no code implementations • 11 Apr 2023 • Kevin Chang, Nathan Dahlin, Rahul Jain, Pierluigi Nuzzo
Over the past decade, neural network (NN)-based controllers have demonstrated remarkable efficacy in a variety of decision-making tasks.
1 code implementation • 10 Apr 2023 • Dengwang Tang, Ashutosh Nayyar, Rahul Jain
The Common Information (CI) approach provides a systematic way to transform a multi-agent stochastic control problem to a single-agent partially observed Markov decision problem (POMDP) called the coordinator's POMDP.
no code implementations • 20 Mar 2023 • Botao Hao, Rahul Jain, Dengwang Tang, Zheng Wen
We first propose an Informed Posterior Sampling-based RL (iPSRL) algorithm that uses the offline dataset, and information about the expert's behavioral policy used to generate the offline dataset.
no code implementations • 7 Feb 2023 • Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen
This offers insight into how pretraining can greatly improve online performance and how the degree of improvement increases with the expert's competence level.
no code implementations • 2 Feb 2023 • Akhil Agnihotri, Rahul Jain, Haipeng Luo
In this paper, we introduce a new policy optimization with function approximation algorithm for constrained MDPs with the average criterion.
no code implementations • 27 Jan 2023 • Krishna C Kalagarla, Rahul Jain, Pierluigi Nuzzo
Constrained Markov decision processes (CMDPs) model scenarios of sequential decision making with multiple objectives that are increasingly important in many applications.
no code implementations • 12 Nov 2022 • Namasivayam Kalithasan, Himanshu Singh, Vishal Bindal, Arnav Tuli, Vishwajeet Agrawal, Rahul Jain, Parag Singla, Rohan Paul
Given a natural language instruction and an input scene, our goal is to train a model to output a manipulation program that can be executed by the robot.
no code implementations • 8 Sep 2022 • Dhruva Kartik, Sagar Sudhakara, Rahul Jain, Ashutosh Nayyar
We consider a multi-agent system in which a decentralized team of agents controls a stochastic system in the presence of an adversary.
no code implementations • 17 Mar 2022 • Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
Autonomous agents often operate in scenarios where the state is partially observed.
no code implementations • 31 Jan 2022 • Liyu Chen, Rahul Jain, Haipeng Luo
We study regret minimization for infinite-horizon average-reward Markov Decision Processes (MDPs) under cost constraints.
no code implementations • 27 Sep 2021 • Krishna C. Kalagarla, Rahul Jain, Pierluigi Nuzzo
We present a model-free reinforcement learning algorithm to find an optimal policy for a finite-horizon Markov decision process while guaranteeing a desired lower bound on the probability of satisfying a signal temporal logic (STL) specification.
no code implementations • 8 Sep 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar
In this paper, we propose Posterior Sampling Reinforcement Learning for Zero-sum Stochastic Games (PSRL-ZSG), the first online learning algorithm that achieves Bayesian regret bound of $O(HS\sqrt{AT})$ in the infinite-horizon zero-sum stochastic games with average-reward criterion.
no code implementations • 7 Sep 2021 • William Chang, Mehdi Jafarnia-Jahromi, Rahul Jain
For the first setting, we propose a UCB-inspired algorithm that achieves $O(\log T)$ regret whether the rewards are IID or Markovian.
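For context, the exploration bonus behind UCB-style algorithms can be illustrated with the classic UCB1 rule for IID rewards in [0, 1] — a textbook baseline that also enjoys logarithmic regret, not the algorithm proposed in the paper:

```python
import math
import random

def ucb1(pull, n_arms, horizon, seed=0):
    """Classic UCB1: play the arm maximizing empirical mean plus an
    exploration bonus sqrt(2 ln t / n_a), after one pull of each arm."""
    random.seed(seed)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:                      # initialize: play each arm once
            arm = t - 1
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = pull(arm)                        # observe a reward in [0, 1]
        counts[arm] += 1
        sums[arm] += r
    return counts, sums
```

The bonus term shrinks as an arm is sampled, so suboptimal arms are pulled only O(log T) times, which yields the logarithmic regret rate.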
no code implementations • NeurIPS 2021 • Liyu Chen, Mehdi Jafarnia-Jahromi, Rahul Jain, Haipeng Luo
We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as certain properties are ensured.
no code implementations • 9 Jun 2021 • Mehdi Jafarnia-Jahromi, Liyu Chen, Rahul Jain, Haipeng Luo
We consider the problem of online reinforcement learning for the Stochastic Shortest Path (SSP) problem modeled as an unknown MDP with an absorbing state.
no code implementations • 25 Feb 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar
Learning optimal controllers for POMDPs when the model is unknown is harder.
no code implementations • 13 Jan 2021 • Sujoy Bhore, Rahul Jain
We show that for every $\epsilon> 0$, there exists a polynomial-time algorithm that can solve Reachability in an $n$ vertex directed penny graph, using $O(n^{1/4+\epsilon})$ space.
Computational Complexity • Computational Geometry
no code implementations • 1 Nov 2020 • Krishna C. Kalagarla, Rahul Jain, Pierluigi Nuzzo
We present a method to find an optimal policy with respect to a reward function for a discounted Markov decision process under general linear temporal logic (LTL) specifications.
no code implementations • 28 Oct 2020 • Nathan Dahlin, Krishna Chaitanya Kalagarla, Nikhil Naik, Rahul Jain, Pierluigi Nuzzo
In an ever-expanding set of research and application areas, deep neural networks (DNNs) set the bar for algorithm performance.
no code implementations • 23 Sep 2020 • Krishna C. Kalagarla, Rahul Jain, Pierluigi Nuzzo
Constrained Markov Decision Processes (CMDPs) formalize sequential decision-making problems whose objective is to minimize a cost function while satisfying constraints on additional cost functions.
no code implementations • 23 Jul 2020 • Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Rahul Jain
We develop several new algorithms for learning Markov Decision Processes in an infinite-horizon average-reward setting with linear function approximation.
no code implementations • 8 Jun 2020 • Mehdi Jafarnia-Jahromi, Chen-Yu Wei, Rahul Jain, Haipeng Luo
Recently, model-free reinforcement learning has attracted research attention due to its simplicity, memory and computation efficiency, and the flexibility to combine with function approximation.
no code implementations • 8 Jun 2020 • Hiteshi Sharma, Rahul Jain
The key to success has been the use of deep neural networks to approximate the policy and value function.
no code implementations • 30 Mar 2020 • Nathan Dahlin, Rahul Jain
A market consisting of a generator with thermal and renewable generation capability, a set of non-preemptive loads (i.e., loads which cannot be interrupted once started), and an independent system operator (ISO) is considered.
1 code implementation • ICML 2020 • Chen-Yu Wei, Mehdi Jafarnia-Jahromi, Haipeng Luo, Hiteshi Sharma, Rahul Jain
Model-free reinforcement learning is known to be memory and computation efficient and more amenable to large-scale problems.
no code implementations • 4 Apr 2018 • Abhishek Gupta, Rahul Jain, Peter Glynn
In many branches of engineering, the Banach contraction mapping theorem is employed to establish the convergence of certain deterministic algorithms.
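The theorem guarantees that iterating a contraction converges geometrically to its unique fixed point. A minimal numerical sketch of that iteration (the example map and tolerances are illustrative, not taken from the paper):

```python
import math

def fixed_point(T, x0, tol=1e-10, max_iter=10_000):
    """Iterate x_{k+1} = T(x_k); for a contraction, the Banach theorem
    guarantees geometric convergence to the unique fixed point."""
    x = x0
    for _ in range(max_iter):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

# Example: T(x) = cos(x) is a contraction on [0, 1] since
# |T'(x)| = |sin x| <= sin(1) < 1, so the iteration converges
# to the unique solution of x = cos(x) (roughly 0.739085).
```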
no code implementations • NeurIPS 2017 • Yi Ouyang, Mukul Gagrani, Ashutosh Nayyar, Rahul Jain
This regret bound matches the best available bound for weakly communicating MDPs.
no code implementations • 4 May 2015 • Naumaan Nayyar, Dileep Kalathil, Rahul Jain
The objective is to design a policy that maximizes the expected reward over a time horizon in the single-player setting, and the sum of expected rewards in the multiplayer setting.
no code implementations • 30 Nov 2014 • Dileep Kalathil, Vivek S. Borkar, Rahul Jain
We propose a new simple and natural algorithm for learning the optimal Q-value function of a discounted-cost Markov Decision Process (MDP) when the transition kernels are unknown.
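For comparison, standard tabular Q-learning is the classical baseline for learning Q-values when the transition kernels are unknown. A minimal sketch (the step-size and exploration choices here are illustrative; this is not the paper's algorithm):

```python
import numpy as np

def q_learning(env_step, S, A, episodes, horizon, gamma=0.9,
               eps=0.1, alpha=0.5, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration: update
    Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a')."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((S, A))
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            if rng.random() < eps:           # explore
                a = int(rng.integers(A))
            else:                            # exploit
                a = int(Q[s].argmax())
            s_next, r = env_step(s, a, rng)
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next
    return Q
```

Because the target uses the learned Q itself (bootstrapping), no model of the transition kernels is ever estimated.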
no code implementations • 3 Nov 2014 • Dileep Kalathil, Vivek Borkar, Rahul Jain
Firstly, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell's approachability strategy.
no code implementations • 22 Nov 2010 • Yi Gai, Bhaskar Krishnamachari, Rahul Jain
Furthermore, these policies only require storage that grows linearly in the number of unknown parameters.