no code implementations • 14 Apr 2024 • Dipendra Misra, Aldo Pacchiano, Robert E. Schapire
We study interactive learning in a setting where the agent has to generate a response (e.g., an action or trajectory) given a context and an instruction.
2 code implementations • 12 Apr 2024 • Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Kianté Brantley, Dipendra Misra, Jason D. Lee, Wen Sun
Motivated by the fact that an offline preference dataset provides informative states (i.e., data that is preferred by the labelers), our new algorithm, Dataset Reset Policy Optimization (DR-PO), integrates the existing offline preference dataset into the online policy training procedure via dataset resets: it directly resets the policy optimizer to the states in the offline dataset, instead of always starting from the initial state distribution.
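The dataset-reset idea above can be illustrated in a few lines. This is a minimal sketch, not the paper's implementation: the toy environment, the policy, and the 50/50 reset probability are all hypothetical stand-ins for DR-PO's actual components.

```python
import random

def rollout(env_step, policy, start_state, horizon=5):
    """Roll out `policy` from `start_state`; return (visited states, total reward)."""
    s, total = start_state, 0.0
    states = [s]
    for _ in range(horizon):
        a = policy(s)
        s, r = env_step(s, a)
        states.append(s)
        total += r
    return states, total

def dataset_reset_rollouts(env_step, policy, offline_states, n=10, reset_prob=0.5):
    """Collect n rollouts; with probability reset_prob, start from a state
    sampled from the offline preference dataset (the "dataset reset"),
    otherwise from the initial state distribution (toy: always state 0)."""
    out = []
    for _ in range(n):
        if offline_states and random.random() < reset_prob:
            start = random.choice(offline_states)  # reset to an informative state
        else:
            start = 0
        out.append(rollout(env_step, policy, start))
    return out
```

The point of the reset is that rollouts then begin from states the labelers already marked as good, so the optimizer spends its interaction budget near preferred behaviour rather than rediscovering it from scratch.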
no code implementations • 20 Mar 2024 • Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford
We study two settings: one where there is i.i.d. noise in the observation, and a more challenging one that also includes exogenous noise, i.e., non-i.i.d. noise that is temporally correlated, such as the motion of people or cars in the background.
no code implementations • 12 Feb 2024 • Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté
We introduce Language Feedback Models (LFMs) that identify desirable behaviour (actions that help achieve tasks specified in the instruction) for imitation learning in instruction following.
1 code implementation • 21 Dec 2023 • Pratyusha Sharma, Jordan T. Ash, Dipendra Misra
Transformer-based Large Language Models (LLMs) have become a fixture in modern machine learning.
no code implementations • 11 Dec 2023 • Ching-An Cheng, Andrey Kolobov, Dipendra Misra, Allen Nie, Adith Swaminathan
We introduce a new benchmark, LLF-Bench (Learning from Language Feedback Benchmark; pronounced as "elf-bench"), to evaluate the ability of AI agents to interactively learn from natural language feedback and instructions.
1 code implementation • 20 Jun 2023 • Jonathan D. Chang, Kiante Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun
In particular, we extend RL algorithms to allow them to interact with a dynamic black-box guide LLM and propose RL with guided feedback (RLGF), a suite of RL algorithms for LLM fine-tuning.
1 code implementation • 14 Nov 2022 • Shengpu Tang, Felipe Vieira Frujeri, Dipendra Misra, Alex Lamb, John Langford, Paul Mineiro, Sebastian Kochman
Modern decision-making systems, from robots to web recommendation engines, are expected to adapt to user preferences, changing circumstances, or even new tasks.
1 code implementation • 31 Oct 2022 • Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford
We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process, which is prevalent in practical applications.
1 code implementation • 26 Oct 2022 • Andrew Bennett, Dipendra Misra, Nathan Kallus
Many existing approaches to safe RL rely on receiving numeric safety feedback, but in many cases this feedback can only take binary values; that is, whether an action in a given state is safe or unsafe.
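The binary-feedback constraint above can be made concrete with a toy sketch. This is not the paper's algorithm, just a hedged illustration of the setting: with only safe/unsafe labels we can estimate P(safe | state, action) and gate actions on it, rather than trading off a numeric safety cost.

```python
from collections import defaultdict

def train_safety_classifier(transitions):
    """Fit a deliberately simple safety estimate from binary feedback.
    `transitions` is a list of (state, action, is_safe) with is_safe in {0, 1}.
    Returns a function estimating P(safe | state, action) by empirical frequency."""
    counts = defaultdict(lambda: [0, 0])  # (state, action) -> [safe count, total count]
    for s, a, ok in transitions:
        counts[(s, a)][0] += ok
        counts[(s, a)][1] += 1
    def p_safe(s, a):
        safe, total = counts.get((s, a), (0, 0))
        return safe / total if total else 0.5  # unseen pairs: uninformative prior
    return p_safe

def safe_actions(state, actions, p_safe, threshold=0.9):
    """Restrict the agent to actions the classifier deems sufficiently safe."""
    return [a for a in actions if p_safe(state, a) >= threshold]
```

A real safe-RL method would of course need generalization across states and guarantees on the filtered policy; the sketch only shows what "feedback can only take binary values" forces the learner to work with.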
no code implementations • 17 Jul 2022 • Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford
In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information.
no code implementations • 9 Jun 2022 • Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford
In real-world reinforcement learning applications, the learner's observation space is almost always high-dimensional, with both relevant and irrelevant information about the task at hand.
no code implementations • 27 May 2022 • Yao Liu, Dipendra Misra, Miro Dudík, Robert E. Schapire
We study reinforcement learning (RL) in settings where observations are high-dimensional, but where an RL agent has access to abstract knowledge about the structure of the state space, as is the case, for example, when a robot is tasked to go to a specific room in a building using observations from its own camera, while having access to the floor plan.
no code implementations • 28 Feb 2022 • Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy
Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.
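The objective described above is commonly instantiated as an InfoNCE-style loss; here is a minimal pure-Python sketch (generic formulation, not tied to any one paper's setup), where the "positive" is another view of the same input and the "negatives" are views of different inputs.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: pull the positive view toward the
    anchor, push negative views away. Returns -log softmax probability
    assigned to the positive among {positive} + negatives."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))
    logits = [dot(anchor, positive) / temperature] + [
        dot(anchor, n) / temperature for n in negatives
    ]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]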
no code implementations • ICLR 2022 (first posted 17 Oct 2021) • Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford
We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.
no code implementations • 18 Jun 2021 • Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra
We focus on disambiguating the role of one of these parameters: the number of negative examples.
no code implementations • 21 May 2021 • Andrew Bennett, Dipendra Misra, Nga Than
Topic models are widely used in studying social phenomena.
1 code implementation • 13 Feb 2021 • Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudík, Patrick Shafto
We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities.
Tasks: General Reinforcement Learning, Grounded Language Learning, +2
no code implementations • ICLR 2021 • Dipendra Misra, Qinghua Liu, Chi Jin, John Langford
We propose a novel setting for reinforcement learning that combines two common real-world difficulties: presence of observations (such as camera images) and factored states (such as location of objects).
no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford
We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.
no code implementations • ICML 2020 • Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford
We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space.
no code implementations • 30 May 2019 • Kavosh Asadi, Dipendra Misra, Seungchan Kim, Michel L. Littman
In this paper, we address the compounding-error problem by introducing a multi-step model that directly outputs the outcome of executing a sequence of actions.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning, +1
4 code implementations • CVPR 2019 • Howard Chen, Alane Suhr, Dipendra Misra, Noah Snavely, Yoav Artzi
We study the problem of jointly reasoning about language and vision through a navigation and spatial reasoning task.
Ranked #10 on Vision and Language Navigation (Touchdown Dataset)
no code implementations • 21 Nov 2018 • Aaron Walsman, Yonatan Bisk, Saadia Gabriel, Dipendra Misra, Yoav Artzi, Yejin Choi, Dieter Fox
Building perceptual systems for robotics which perform well under tight computational budgets requires novel architectures which rethink the traditional computer vision pipeline.
1 code implementation • 10 Nov 2018 • Valts Blukis, Dipendra Misra, Ross A. Knepper, Yoav Artzi
We propose an approach for mapping natural language instructions and raw observations to continuous control of a quadcopter drone.
no code implementations • 31 Oct 2018 • Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman
When environmental interaction is expensive, model-based reinforcement learning offers a solution by planning ahead and avoiding costly mistakes.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning, +1
no code implementations • EMNLP 2018 • Dipendra Misra, Ming-Wei Chang, Xiaodong He, Wen-tau Yih
Semantic parsing from denotations faces two key challenges in model training: (1) given only the denotations (e.g., answers), search for good candidate semantic parses, and (2) choose the best model update algorithm.
5 code implementations • EMNLP 2018 • Dipendra Misra, Andrew Bennett, Valts Blukis, Eyvind Niklasson, Max Shatkhin, Yoav Artzi
We propose to decompose instruction execution to goal prediction and action generation.
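The decomposition into goal prediction and action generation can be sketched on a 1-D toy world. Both models below are hypothetical placeholders (the paper's actual components are learned neural models operating on images and text); the sketch only shows the two-stage control flow.

```python
def execute_instruction(instruction, start_pos, goal_model, action_model, max_steps=20):
    """Two-stage instruction execution on a 1-D toy world:
    (1) goal prediction: map the instruction to a target position;
    (2) action generation: step toward the predicted goal."""
    goal = goal_model(instruction)
    pos, trace = start_pos, [start_pos]
    for _ in range(max_steps):
        if pos == goal:
            break
        pos += action_model(pos, goal)  # toy actions: +1 or -1
        trace.append(pos)
    return trace
```

The appeal of the split is modularity: the goal predictor handles language grounding once, and the action generator reduces to goal-conditioned control.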
no code implementations • 1 Jun 2018 • Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman
Learning a generative model is a key component of model-based reinforcement learning.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning, +1
1 code implementation • ICML 2018 • Kavosh Asadi, Dipendra Misra, Michael L. Littman
We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Lipschitz.
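The flavor of such a Lipschitz argument can be sketched in one line; this is the standard derivation under generic assumptions (deterministic $K_T$-Lipschitz dynamics, $K_R$-Lipschitz reward, $\gamma K_T < 1$), with generic notation that is not necessarily the paper's:

```latex
% Unrolling the value function through a K_T-Lipschitz model,
% the per-step sensitivity grows geometrically and sums to:
\[
  |V(s) - V(s')| \;\le\; K_R \sum_{t=0}^{\infty} (\gamma K_T)^t \, d(s, s')
  \;=\; \frac{K_R}{1 - \gamma K_T}\, d(s, s').
\]
```

That is, a Lipschitz model propagates state perturbations in a controlled way, so the induced value estimate inherits a Lipschitz constant of its own.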
Tasks: Model-based Reinforcement Learning, Reinforcement Learning, +1
2 code implementations • 23 Jan 2018 • Claudia Yan, Dipendra Misra, Andrew Bennett, Aaron Walsman, Yonatan Bisk, Yoav Artzi
We present CHALET, a 3D house simulator with support for navigation and manipulation.
1 code implementation • EMNLP 2017 • Dipendra Misra, John Langford, Yoav Artzi
We propose to directly map raw visual observations and text input to actions for instruction execution.