Search Results for author: J. Andrew Bagnell

Found 46 papers, 17 papers with code

Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap

2 code implementations 4 Mar 2021 Gokul Swamy, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu

We provide a unifying view of a large family of previous imitation learning algorithms through the lens of moment matching.

Imitation Learning
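
To make the moment-matching view concrete, here is a minimal numpy sketch under assumed toy structure (a tabular softmax learner, a random linear feature map, and a fixed state sample): the worst-case linear discriminator over the unit ball is exactly the gap between expert and learner feature expectations, and the learner ascends its alignment with that direction. The paper's actual solvers and moment classes differ.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, D = 5, 3, 4                              # states, actions, feature dim
phi = rng.normal(size=(S, A, D))               # hypothetical feature map phi(s, a)
states = rng.integers(0, S, size=200)          # fixed sample of visited states
expert_actions = rng.integers(0, A, size=200)  # hypothetical expert labels

# Expert "moments": expected features under the expert's state-action distribution.
mu_expert = phi[states, expert_actions].mean(axis=0)

theta = np.zeros((S, A))                       # learner: tabular softmax policy

def policy(theta):
    z = np.exp(theta - theta.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

for _ in range(500):
    pi = policy(theta)
    mu_learner = np.einsum('na,nad->d', pi[states], phi[states]) / len(states)
    w = mu_expert - mu_learner                 # best unit-ball linear discriminator
    scores = phi @ w                           # (S, A): per-action moment alignment
    # Softmax policy gradient that shifts mass toward better-aligned actions.
    grad = pi * (scores - (pi * scores).sum(axis=1, keepdims=True))
    counts = np.bincount(states, minlength=S)[:, None] / len(states)
    theta += 5.0 * counts * grad

pi = policy(theta)
mu_learner = np.einsum('na,nad->d', pi[states], phi[states]) / len(states)
print("final moment gap:", np.linalg.norm(mu_expert - mu_learner))
```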

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

3 code implementations 2 Nov 2010 Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell

Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning.

Imitation Learning Structured Prediction
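
The reduction is the DAgger algorithm: execute the current learner, query the expert on the states the learner actually visits, aggregate those labels, and retrain, so each supervised fit on the aggregate is a no-regret online learning step. A minimal sketch, assuming hypothetical `env`, `expert`, and `fit_classifier` interfaces:

```python
import numpy as np

def dagger(env, expert, fit_classifier, n_iters=10, horizon=100):
    """DAgger sketch: roll out a mixture of expert and learner, label every
    visited state with the expert's action, aggregate, and retrain."""
    D_states, D_actions = [], []
    policy = None
    for i in range(n_iters):
        beta = 0.5 ** i                 # decaying chance of executing the expert
        s = env.reset()
        for _ in range(horizon):
            a_expert = expert(s)        # query the expert at the visited state
            a = a_expert if policy is None or np.random.rand() < beta else policy(s)
            D_states.append(s)
            D_actions.append(a_expert)  # aggregate expert labels, not executed actions
            s, done = env.step(a)       # assumed interface: returns (next_state, done)
            if done:
                break
        policy = fit_classifier(np.array(D_states), np.array(D_actions))
    return policy
```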

Inverse Reinforcement Learning without Reinforcement Learning

1 code implementation 26 Mar 2023 Gokul Swamy, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu

In this work, we demonstrate for the first time a more informed imitation learning reduction where we utilize the state distribution of the expert to alleviate the global exploration component of the RL subroutine, providing an exponential speedup in theory.

Continuous Control Imitation Learning +2

Provably Efficient Imitation Learning from Observation Alone

1 code implementation 27 May 2019 Wen Sun, Anirudh Vemula, Byron Boots, J. Andrew Bagnell

We design a new model-free algorithm for ILFO, Forward Adversarial Imitation Learning (FAIL), which learns a sequence of time-dependent policies by minimizing an Integral Probability Metric between the observation distributions of the expert policy and the learner.

Imitation Learning OpenAI Gym
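
One concrete Integral Probability Metric that could play the role of FAIL's distribution-matching objective is the (squared) Maximum Mean Discrepancy; the sketch below scores how far apart two observation samples are, with the RBF kernel and bandwidth as assumptions rather than the paper's discriminator class:

```python
import numpy as np

def mmd2(X, Y, bandwidth=1.0):
    """Squared Maximum Mean Discrepancy with an RBF kernel, one concrete
    Integral Probability Metric between two samples of observations
    (rows of X and Y)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

# Hypothetical usage: score a candidate time-t policy by the gap between
# expert and learner observation samples at time t.
# gap = mmd2(expert_obs_t, learner_obs_t)
```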

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

1 code implementation 13 Oct 2022 Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun

We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has access to an offline dataset and the ability to collect experience via real-world online interaction.

Montezuma's Revenge Q-Learning
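
The "hybrid" ingredient is that every learning update draws on both data sources at once. A minimal replay-buffer sketch (the 50/50 mixing fraction and the interfaces are assumptions, not the paper's exact algorithm):

```python
import random

class HybridBuffer:
    """Sample minibatches that mix a fixed offline dataset with a growing
    online replay buffer, in the spirit of hybrid RL."""
    def __init__(self, offline_transitions, offline_frac=0.5):
        self.offline = offline_transitions   # list of (s, a, r, s') tuples
        self.online = []
        self.offline_frac = offline_frac

    def add(self, transition):
        self.online.append(transition)

    def sample(self, batch_size):
        n_off = int(batch_size * self.offline_frac)
        batch = random.sample(self.offline, min(n_off, len(self.offline)))
        if self.online:
            batch += random.choices(self.online, k=batch_size - len(batch))
        return batch
```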

TRON: A Fast Solver for Trajectory Optimization with Non-Smooth Cost Functions

1 code implementation 31 Mar 2020 Anirudh Vemula, J. Andrew Bagnell

TRON achieves this by exploiting the structure of the objective to adaptively smooth the cost function, resulting in a sequence of objectives that can be efficiently optimized.

Robotics Systems and Control
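
As a toy instance of that adaptive smoothing, the sketch below minimizes the non-smooth cost ||Ax - b||_1 through a sequence of pseudo-Huber-smoothed objectives with a decreasing smoothing parameter; TRON's actual solver, schedule, and trajectory-optimization structure differ.

```python
import numpy as np

def soft_abs(x, mu):
    """Pseudo-Huber surrogate for |x|; recovers |x| as mu -> 0."""
    return np.sqrt(x ** 2 + mu ** 2) - mu

def smoothed_l1_solve(A, b, mu0=1.0, decay=0.5, rounds=8, steps=200, lr=1e-2):
    """Minimize the non-smooth cost ||Ax - b||_1 via a sequence of
    smoothed objectives, tightening the smoothing each round."""
    x, mu = np.zeros(A.shape[1]), mu0
    for _ in range(rounds):
        for _ in range(steps):
            r = A @ x - b
            # gradient of sum_i soft_abs(r_i, mu)
            x -= lr * (A.T @ (r / np.sqrt(r ** 2 + mu ** 2)))
        mu *= decay  # tighten the smoothing
    return x
```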

Exploration in Action Space

1 code implementation 31 Mar 2020 Anirudh Vemula, Wen Sun, J. Andrew Bagnell

Parameter space exploration methods with black-box optimization have recently been shown to outperform state-of-the-art approaches in continuous control reinforcement learning domains.

Continuous Control reinforcement-learning +1

Causal Imitation Learning under Temporally Correlated Noise

1 code implementation 2 Feb 2022 Gokul Swamy, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu

We develop algorithms for imitation learning from policy data that was corrupted by temporally correlated noise in expert actions.

Econometrics Imitation Learning
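
The fix draws on instrumental variable techniques from econometrics: past states make valid instruments because they are correlated with current states but independent of the current noise. A minimal two-stage least squares sketch of that idea (the paper's own estimators are game-theoretic):

```python
import numpy as np

def two_stage_least_squares(Z, X, Y):
    """2SLS instrumental-variable regression: use instruments Z (e.g., past
    states) to deconfound the regression of expert actions Y on current
    states X when the noise corrupting X and Y is correlated."""
    # Stage 1: project the (confounded) inputs onto the instruments.
    W1, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ W1
    # Stage 2: regress actions on the projected, deconfounded inputs.
    W2, *_ = np.linalg.lstsq(X_hat, Y, rcond=None)
    return W2
```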

Planning and Execution using Inaccurate Models with Provable Guarantees

1 code implementation 9 Mar 2020 Anirudh Vemula, Yash Oza, J. Andrew Bagnell, Maxim Likhachev

In this paper, we propose CMAX, an approach for interleaving planning and execution.
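
A minimal sketch of the interleaving idea, with `plan`, `execute`, and `model` as hypothetical interfaces: whenever the real outcome of an action disagrees with the model's prediction, that (state, action) pair's cost is inflated so subsequent plans route around the badly modeled region.

```python
def cmax(plan, execute, model, s0, s_goal, penalty=1e6, max_steps=1000):
    """Interleaved planning and execution in the spirit of CMAX: plan with an
    inaccurate model, execute one action, and penalize transitions the model
    got wrong so later plans avoid them."""
    incorrect = set()
    cost = lambda s, a: 1.0 + (penalty if (s, a) in incorrect else 0.0)
    s = s0
    for _ in range(max_steps):
        if s == s_goal:
            return True
        a = plan(model, cost, s, s_goal)    # e.g., limited-horizon search
        s_pred, s_next = model(s, a), execute(s, a)
        if s_next != s_pred:
            incorrect.add((s, a))           # model is wrong here; avoid it
        s = s_next
    return False
```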

Sequence Model Imitation Learning with Unobserved Contexts

1 code implementation 3 Aug 2022 Gokul Swamy, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu

We consider imitation learning problems where the learner's ability to mimic the expert increases throughout the course of an episode as more information is revealed.

Continuous Control Imitation Learning

Contrasting Exploration in Parameter and Action Space: A Zeroth-Order Optimization Perspective

1 code implementation 31 Jan 2019 Anirudh Vemula, Wen Sun, J. Andrew Bagnell

Black-box optimizers that explore in parameter space have often been shown to outperform more sophisticated action space exploration methods developed specifically for the reinforcement learning problem.

Continuous Control regression +2
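
Parameter-space exploration in this zeroth-order view reduces to estimating a gradient from random perturbations of the policy parameters. A two-point estimator sketch, with `J` a (noisy) episodic-return oracle (names and constants are assumptions):

```python
import numpy as np

def zeroth_order_grad(J, theta, sigma=0.1, n_dirs=16, rng=None):
    """Two-point zeroth-order gradient estimate: probe the return J along
    random directions in PARAMETER space, the style of exploration the
    paper contrasts with action-space methods."""
    rng = rng or np.random.default_rng()
    g = np.zeros_like(theta)
    for _ in range(n_dirs):
        u = rng.normal(size=theta.shape)
        g += (J(theta + sigma * u) - J(theta - sigma * u)) / (2 * sigma) * u
    return g / n_dirs
```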

CMAX++: Leveraging Experience in Planning and Execution using Inaccurate Models

1 code implementation 21 Sep 2020 Anirudh Vemula, J. Andrew Bagnell, Maxim Likhachev

In this paper we propose CMAX++, an approach that leverages real-world experience to improve the quality of resulting plans over successive repetitions of a robotic task.

Friction Robot Navigation

Minimax Optimal Online Imitation Learning via Replay Estimation

1 code implementation 30 May 2022 Gokul Swamy, Nived Rajaraman, Matthew Peng, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu, Jiantao Jiao, Kannan Ramchandran

In the tabular setting or with linear function approximation, our meta theorem shows that the performance gap incurred by our approach achieves the optimal $\widetilde{O}\left(\min\left({H^{3/2}}/{N},\, {H}/{\sqrt{N}}\right)\right)$ dependency, under significantly weaker assumptions compared to prior work.

Continuous Control Imitation Learning

Hybrid Inverse Reinforcement Learning

1 code implementation 13 Feb 2024 Juntao Ren, Gokul Swamy, Zhiwei Steven Wu, J. Andrew Bagnell, Sanjiban Choudhury

In this work, we propose using hybrid RL -- training on a mixture of online and expert data -- to curtail unnecessary exploration.

Continuous Control Imitation Learning +2

Log-DenseNet: How to Sparsify a DenseNet

1 code implementation ICLR 2018 Hanzhang Hu, Debadeepta Dey, Allison Del Giorno, Martial Hebert, J. Andrew Bagnell

Skip connections are increasingly utilized by deep neural networks to improve accuracy and cost-efficiency.

Semantic Segmentation
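
Log-DenseNet's sparsification keeps, for layer i, only the incoming skip connections from layers at power-of-two distances, so each layer has O(log i) inputs instead of DenseNet's O(i). A small sketch of that connection rule (the indexing convention is an assumption):

```python
def log_dense_inputs(i):
    """Indices of earlier layers feeding layer i under log-dense
    connectivity: layer i receives layers i-1, i-2, i-4, ..., keeping
    only O(log i) skip connections."""
    preds, k = [], 1
    while i - k >= 0:
        preds.append(i - k)
        k *= 2
    return preds

# e.g., log_dense_inputs(9) -> [8, 7, 5, 1]
```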

Dual Policy Iteration

no code implementations NeurIPS 2018 Wen Sun, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell

Recently, a novel class of Approximate Policy Iteration (API) algorithms has demonstrated impressive practical performance (e.g., ExIt from [2], AlphaGo-Zero from [27]).

Continuous Control

Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing

no code implementations 22 Aug 2017 Hanzhang Hu, Debadeepta Dey, Martial Hebert, J. Andrew Bagnell

Experimentally, the adaptive weights induce more competitive anytime predictions on multiple recognition datasets and models than non-adaptive approaches, including weighing all losses equally.

Predictive-State Decoders: Encoding the Future into Recurrent Networks

no code implementations NeurIPS 2017 Arun Venkatraman, Nicholas Rhinehart, Wen Sun, Lerrel Pinto, Martial Hebert, Byron Boots, Kris M. Kitani, J. Andrew Bagnell

We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation to target predicting future observations.

Imitation Learning
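
That supervision amounts to an auxiliary loss: decode the recurrent network's internal state into a window of future observations and penalize the error alongside the task loss. A sketch, with the linear decoder and mixing weight as assumptions:

```python
import numpy as np

def psd_loss(task_loss, hidden, future_obs, decoder_W, lam=0.5):
    """Predictive-State Decoder sketch: regularize an RNN's hidden state
    toward a predictive-state representation by also decoding it into
    upcoming observations."""
    pred = hidden @ decoder_W          # (batch, d_hidden) @ (d_hidden, d_future)
    psd = ((pred - future_obs) ** 2).mean()
    return task_loss + lam * psd       # joint objective to backpropagate
```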

Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction

no code implementations ICML 2017 Wen Sun, Arun Venkatraman, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell

We demonstrate that AggreVaTeD --- a policy gradient extension of the Imitation Learning (IL) approach of (Ross & Bagnell, 2014) --- can leverage such an oracle to achieve faster and better solutions with less training data than a less-informed Reinforcement Learning (RL) technique.

Dependency Parsing Imitation Learning +1

Gradient Boosting on Stochastic Data Streams

no code implementations 1 Mar 2017 Hanzhang Hu, Wen Sun, Arun Venkatraman, Martial Hebert, J. Andrew Bagnell

To generalize from the batch to the online setting, we first introduce the definition of an online weak learning edge; with it, for strongly convex and smooth loss functions, we present Streaming Gradient Boosting (SGB), an algorithm with exponential shrinkage guarantees in the number of weak learners.
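
As a rough picture of the online boosting step, each incoming example flows through the chain of weak learners, and each learner takes a step toward the residual left by the partial ensemble ahead of it. The sketch below uses linear weak learners, squared loss, and constant shrinkage; SGB's exact weights and guarantees are in the paper.

```python
import numpy as np

class StreamingGradientBooster:
    """Online gradient boosting sketch for squared loss with linear
    weak learners and constant shrinkage."""
    def __init__(self, n_learners, dim, lr=0.1, shrink=0.5):
        self.W = np.zeros((n_learners, dim))
        self.lr, self.shrink = lr, shrink

    def predict(self, x):
        return self.shrink * (self.W @ x).sum()

    def update(self, x, y):
        partial = 0.0
        for i in range(len(self.W)):
            residual = y - partial            # -grad of (1/2)(y - F)^2 at F = partial
            # online least-squares step of learner i toward the residual
            self.W[i] += self.lr * (residual - self.W[i] @ x) * x
            partial += self.shrink * (self.W[i] @ x)
```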

Efficient Feature Group Sequencing for Anytime Linear Prediction

no code implementations 19 Sep 2014 Hanzhang Hu, Alexander Grubb, J. Andrew Bagnell, Martial Hebert

We theoretically guarantee that our algorithms achieve near-optimal linear predictions at each budget when a feature group is chosen.

A Discriminative Framework for Anomaly Detection in Large Videos

no code implementations 28 Sep 2016 Allison Del Giorno, J. Andrew Bagnell, Martial Hebert

We address an anomaly detection setting in which training sequences are unavailable and anomalies are scored independently of temporal ordering.

Anomaly Detection Density Estimation

Learning Transferable Policies for Monocular Reactive MAV Control

no code implementations 1 Aug 2016 Shreyansh Daftry, J. Andrew Bagnell, Martial Hebert

The ability to transfer knowledge gained in previous tasks into new contexts is one of the most important mechanisms of human learning.

Introspective Perception: Learning to Predict Failures in Vision Systems

no code implementations 28 Jul 2016 Shreyansh Daftry, Sam Zeng, J. Andrew Bagnell, Martial Hebert

As robots aspire for long-term autonomous operations in complex dynamic environments, the ability to reliably take mission-critical decisions in ambiguous situations becomes critical.

Learning to Filter with Predictive State Inference Machines

no code implementations 30 Dec 2015 Wen Sun, Arun Venkatraman, Byron Boots, J. Andrew Bagnell

Latent state space models are a fundamental and widely used tool for modeling dynamical systems.

Visual Chunking: A List Prediction Framework for Region-Based Object Detection

no code implementations 27 Oct 2014 Nicholas Rhinehart, Jiaji Zhou, Martial Hebert, J. Andrew Bagnell

We present an efficient algorithm with provable performance for building a high-quality list of detections from any candidate set of region-based proposals.

Chunking object-detection +2

Solving Games with Functional Regret Estimation

no code implementations 28 Nov 2014 Kevin Waugh, Dustin Morrill, J. Andrew Bagnell, Michael Bowling

We propose a novel online learning method for minimizing regret in large extensive-form games.
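
The classical tabular primitive in this space is regret matching, which the paper scales up by estimating regrets with a function approximator rather than storing them per information set. The tabular update, for reference:

```python
import numpy as np

def regret_matching(regrets):
    """Regret matching: play each action with probability proportional to
    its positive cumulative regret (uniform if none is positive)."""
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full(len(regrets), 1.0 / len(regrets))

def update_regrets(regrets, utilities, strategy):
    """Accumulate regret: each action's utility minus the expected utility
    of the current mixed strategy."""
    return regrets + (utilities - strategy @ utilities)
```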

A Unified View of Large-scale Zero-sum Equilibrium Computation

no code implementations 18 Nov 2014 Kevin Waugh, J. Andrew Bagnell

The task of computing approximate Nash equilibria in large zero-sum extensive-form games has received a tremendous amount of attention due mainly to the Annual Computer Poker Competition.

Reinforcement and Imitation Learning via Interactive No-Regret Learning

no code implementations 23 Jun 2014 Stephane Ross, J. Andrew Bagnell

Recent work has demonstrated that problems, particularly imitation learning and structured prediction, where a learner's predictions influence the input distribution it is tested on, can be naturally addressed by an interactive approach and analyzed using no-regret online learning.

Imitation Learning reinforcement-learning +2

SpeedMachines: Anytime Structured Prediction

no code implementations 2 Dec 2013 Alexander Grubb, Daniel Munoz, J. Andrew Bagnell, Martial Hebert

Structured prediction plays a central role in machine learning applications from computational biology to computer vision.

General Classification Scene Understanding +1

Computational Rationalization: The Inverse Equilibrium Problem

no code implementations 15 Aug 2013 Kevin Waugh, Brian D. Ziebart, J. Andrew Bagnell

Modeling the purposeful behavior of imperfect agents from a small number of observations is a challenging task.

Learning Policies for Contextual Submodular Prediction

no code implementations 11 May 2013 Stephane Ross, Jiaji Zhou, Yisong Yue, Debadeepta Dey, J. Andrew Bagnell

Many prediction domains, such as ad placement, recommendation, trajectory prediction, and document summarization, require predicting a set or list of options.

Document Summarization News Recommendation +1
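
The policies in this line of work are trained to imitate the greedy oracle for (monotone) submodular list objectives, which repeatedly appends the item with the largest marginal gain. A sketch of that oracle, with `marginal_gain(item, chosen)` assumed:

```python
def greedy_list(candidates, marginal_gain, k):
    """Greedy list construction for submodular objectives: repeatedly
    append the candidate with the largest marginal gain over the list
    built so far."""
    chosen, pool = [], set(candidates)
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda c: marginal_gain(c, chosen))
        chosen.append(best)
        pool.remove(best)
    return chosen
```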

An Algorithmic Perspective on Imitation Learning

no code implementations 16 Nov 2018 Takayuki Osa, Joni Pajarinen, Gerhard Neumann, J. Andrew Bagnell, Pieter Abbeel, Jan Peters

This process of learning from demonstrations, and the study of algorithms to do so, is called imitation learning.

Imitation Learning Learning Theory

Predicting Multiple Structured Visual Interpretations

no code implementations ICCV 2015 Debadeepta Dey, Varun Ramakrishna, Martial Hebert, J. Andrew Bagnell

We present a simple approach for producing a small number of structured visual outputs which have high recall, for a variety of tasks including monocular pose estimation and semantic scene segmentation.

Pose Estimation Scene Segmentation +1

Feedback in Imitation Learning: The Three Regimes of Covariate Shift

no code implementations 4 Feb 2021 Jonathan Spencer, Sanjiban Choudhury, Arun Venkatraman, Brian Ziebart, J. Andrew Bagnell

The learner often comes to rely on features that are strongly predictive of decisions, but are subject to strong covariate shift.

Causal Inference Imitation Learning

A Critique of Strictly Batch Imitation Learning

no code implementations 5 Oct 2021 Gokul Swamy, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu

Recent work by Jarrett et al. attempts to frame the problem of offline imitation learning (IL) as one of learning a joint energy-based model, with the hope of outperforming standard behavioral cloning.

Imitation Learning

On the Effectiveness of Iterative Learning Control

1 code implementation 17 Nov 2021 Anirudh Vemula, Wen Sun, Maxim Likhachev, J. Andrew Bagnell

However, there is little prior theoretical work that explains the effectiveness of ILC even in the presence of large modeling errors, where optimal control methods using the misspecified model (MM) often perform poorly.

Industrial Robots

Game-Theoretic Algorithms for Conditional Moment Matching

no code implementations 19 Aug 2022 Gokul Swamy, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu

A variety of problems in econometrics and machine learning, including instrumental variable regression and Bellman residual minimization, can be formulated as satisfying a set of conditional moment restrictions (CMR).

Econometrics regression
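
As a worked form of the CMR setup (notation assumed, not quoted from the paper): a model theta must make a residual conditionally mean-zero given a conditioning variable Z, and, for a sufficiently rich class of test functions, enforcing this is a two-player game, which is what makes game-theoretic algorithms applicable:

```latex
% Conditional moment restrictions: the residual \rho is conditionally
% mean-zero given the conditioning variable Z,
E\left[\rho(X;\theta) \mid Z\right] = 0 .
% For a sufficiently rich test-function class \mathcal{F}, this can be
% enforced by solving the minimax game
\min_{\theta}\ \max_{f \in \mathcal{F}}\ E\left[f(Z)^{\top}\rho(X;\theta)\right].
```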

The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms

1 code implementation 1 Mar 2023 Anirudh Vemula, Yuda Song, Aarti Singh, J. Andrew Bagnell, Sanjiban Choudhury

We propose a novel approach to addressing two fundamental challenges in Model-based Reinforcement Learning (MBRL): the computational expense of repeatedly finding a good policy in the learned model, and the objective mismatch between model fitting and policy computation.

Computational Efficiency Model-based Reinforcement Learning

The Virtues of Pessimism in Inverse Reinforcement Learning

no code implementations 4 Feb 2024 David Wu, Gokul Swamy, J. Andrew Bagnell, Zhiwei Steven Wu, Sanjiban Choudhury

Inverse Reinforcement Learning (IRL) is a powerful framework for learning complex behaviors from expert demonstrations.

Offline RL reinforcement-learning +1
