Search Results for author: Emma Brunskill

Found 79 papers, 23 papers with code

Evaluating and Optimizing Educational Content with Large Language Model Judgments

no code implementations 5 Mar 2024 Joy He-Yueya, Noah D. Goodman, Emma Brunskill

We propose an alternative approach that uses Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes.

Language Modelling, Large Language Model, +1
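
To make the setup concrete, here is a minimal sketch of the LM-as-judge idea. `judge` stands in for any language-model client (a hypothetical callable, not the paper's implementation), and the prompt wording is illustrative only.

```python
from typing import Callable

def score_instruction(judge: Callable[[str], str], instruction: str,
                      concept: str) -> float:
    """Ask an LM judge to predict, on a 1-10 scale, how well a student
    would do after reading `instruction`, and parse the numeric rating."""
    prompt = (
        f"You are an expert educator. A student is learning: {concept}.\n"
        f"Instruction shown to the student:\n{instruction}\n"
        "On a scale of 1 to 10, how well will the student perform on a "
        "quiz about this concept? Reply with a single number."
    )
    return float(judge(prompt).strip().split()[0])

def best_instruction(judge: Callable[[str], str],
                     candidates: list[str], concept: str) -> str:
    # Rank candidate instructions by the LM judge's predicted outcome.
    return max(candidates, key=lambda c: score_instruction(judge, c, concept))
```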

Experiment Planning with Function Approximation

no code implementations NeurIPS 2023 Aldo Pacchiano, Jonathan N. Lee, Emma Brunskill

We study the problem of experiment planning with function approximation in contextual bandit problems.

Model Selection

Adaptive Instrument Design for Indirect Experiments

no code implementations 5 Dec 2023 Yash Chandak, Shiv Shankar, Vasilis Syrgkanis, Emma Brunskill

Indirect experiments provide a valuable framework for estimating treatment effects in situations where conducting randomized controlled trials (RCTs) is impractical or unethical.

Adaptive Interventions with User-Defined Goals for Health Behavior Change

no code implementations 16 Nov 2023 Aishwarya Mandyam, Matthew Jörke, Barbara E. Engelhardt, Emma Brunskill

We prove that our modification incurs only a constant penalty on the cumulative regret while preserving the sample complexity benefits of data sharing.

Thompson Sampling
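
The Thompson Sampling tag refers to the posterior-sampling strategy this work builds on; below is a minimal Beta-Bernoulli sketch of standard Thompson sampling (the textbook baseline, not the paper's goal-aware modification or its data-sharing scheme).

```python
import numpy as np

rng = np.random.default_rng(0)
true_rates = [0.3, 0.5, 0.6]          # unknown success rate per intervention
successes = np.ones(len(true_rates))  # Beta(1, 1) prior for each arm
failures = np.ones(len(true_rates))

for t in range(2000):
    # Sample a plausible rate for each arm, then act greedily on the samples.
    samples = rng.beta(successes, failures)
    arm = int(np.argmax(samples))
    reward = rng.random() < true_rates[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

print("posterior means:", successes / (successes + failures))
```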

Estimating Optimal Policy Value in General Linear Contextual Bandits

no code implementations 19 Feb 2023 Jonathan N. Lee, Weihao Kong, Aldo Pacchiano, Vidya Muthukumar, Emma Brunskill

Whether this is possible for more realistic context distributions has remained an open and important question for tasks such as model selection.

Model Selection, Multi-Armed Bandits

Model-based Offline Reinforcement Learning with Local Misspecification

no code implementations 26 Jan 2023 Kefan Dong, Yannis Flet-Berliac, Allen Nie, Emma Brunskill

We present a model-based offline reinforcement learning policy performance lower bound that explicitly captures dynamics model misspecification and distribution mismatch, and we propose an empirical algorithm for optimal offline policy selection.

D4RL, reinforcement-learning, +1

Giving Feedback on Interactive Student Programs with Meta-Exploration

1 code implementation 16 Nov 2022 Evan Zheran Liu, Moritz Stephan, Allen Nie, Chris Piech, Emma Brunskill, Chelsea Finn

However, teaching and giving feedback on such software is time-consuming: standard approaches require instructors to manually grade student-implemented interactive programs.

Oracle Inequalities for Model Selection in Offline Reinforcement Learning

no code implementations 3 Nov 2022 Jonathan N. Lee, George Tucker, Ofir Nachum, Bo Dai, Emma Brunskill

We propose the first model selection algorithm for offline RL that achieves minimax rate-optimal oracle inequalities up to logarithmic factors.

Model Selection, Offline RL, +2

Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

no code implementations 16 Oct 2022 Allen Nie, Yannis Flet-Berliac, Deon R. Jordan, William Steenbergen, Emma Brunskill

Inspired by statistical model selection methods for supervised learning, we introduce a task- and method-agnostic pipeline for automatically training, comparing, selecting, and deploying the best policy when the provided dataset is limited in size.

Model Selection, Offline RL, +2
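
A rough sketch of the pipeline's shape, under assumed interfaces: `train_algs` maps algorithm names to functions `dataset -> policy` and `ope(policy, data)` returns an estimated policy value. Both are hypothetical stand-ins, not the paper's API, and the paper's actual pipeline is more careful about splitting and selection.

```python
import random

def select_and_deploy(dataset, train_algs, ope, seed=0):
    """Train each candidate, score the resulting policies with off-policy
    evaluation on held-out data, then retrain the winner on all data."""
    data = list(dataset)
    random.Random(seed).shuffle(data)
    split = int(0.8 * len(data))
    train, held_out = data[:split], data[split:]
    scores = {name: ope(alg(train), held_out)
              for name, alg in train_algs.items()}
    best = max(scores, key=scores.get)
    return train_algs[best](data), scores
```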

Offline Policy Optimization with Eligible Actions

1 code implementation 1 Jul 2022 Yao Liu, Yannis Flet-Berliac, Emma Brunskill

Offline policy optimization could have a large impact on many real-world decision-making problems, as online learning may be infeasible in many applications.

Continuous Control, Decision Making

Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

1 code implementation 30 Dec 2021 Tong Mu, Georgios Theocharous, David Arbour, Emma Brunskill

Online reinforcement learning (RL) algorithms are often difficult to deploy in complex human-facing applications as they may learn slowly and have poor early performance.

reinforcement-learning, Reinforcement Learning (RL)

Reinforcement Learning with State Observation Costs in Action-Contingent Noiselessly Observable Markov Decision Processes

1 code implementation NeurIPS 2021 HyunJi Nam, Scott Fleming, Emma Brunskill

Many real-world problems that require making optimal sequences of decisions under uncertainty involve costs when the agent wishes to obtain information about its environment.

Reinforcement Learning (RL)
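
As a rough illustration of the action-contingent observation setting, the sketch below pairs every action with an observe-or-not flag and charges a fixed cost for observing; `env.step(action) -> (state, reward)` is an assumed toy interface, not a specific library's API.

```python
class ObservationCostWrapper:
    """Wrap an environment so the agent must pay to see its state."""

    def __init__(self, env, obs_cost: float):
        self.env, self.obs_cost = env, obs_cost

    def step(self, action, observe: bool):
        state, reward = self.env.step(action)
        if observe:
            return state, reward - self.obs_cost  # pay to observe the state
        return None, reward                       # act blind, save the cost
```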

Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation

no code implementations 28 Nov 2021 Ramtin Keramati, Omer Gottesman, Leo Anthony Celi, Finale Doshi-Velez, Emma Brunskill

Off-policy policy evaluation methods for sequential decision making can be used to help identify if a proposed decision policy is better than a current baseline policy.

Decision Making

Evaluating Treatment Prioritization Rules via Rank-Weighted Average Treatment Effects

1 code implementation 15 Nov 2021 Steve Yadlowsky, Scott Fleming, Nigam Shah, Emma Brunskill, Stefan Wager

We propose rank-weighted average treatment effect (RATE) metrics as a simple and general family of metrics for comparing and testing the quality of treatment prioritization rules.

Marketing
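
For intuition, here is a rough numpy sketch of the rank-weighted idea under randomized treatment assignment: average, over cutoffs, the estimated effect among the top-ranked units minus the overall effect. This is an illustrative simplification (uniform weights, naive difference-in-means), not the paper's estimator or its inference procedure.

```python
import numpy as np

def rank_weighted_ate(scores, w, y, n_grid=20):
    """scores: priority scores; w: 0/1 randomized treatment; y: outcomes.
    Assumes every top-u slice contains both treated and control units."""
    scores, w, y = map(np.asarray, (scores, w, y))
    order = np.argsort(-scores)               # highest-priority units first
    w, y = w[order], y[order]
    def ate(mask):
        return y[mask & (w == 1)].mean() - y[mask & (w == 0)].mean()
    n = len(y)
    overall = ate(np.ones(n, dtype=bool))
    tocs = [ate(np.arange(n) < max(2, int(u * n))) - overall
            for u in np.linspace(0.2, 1.0, n_grid)]
    return float(np.mean(tocs))
```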

Play to Grade: Testing Coding Games as Classifying Markov Decision Process

1 code implementation NeurIPS 2021 Allen Nie, Emma Brunskill, Chris Piech

Contemporary coding education often presents students with the task of developing programs that have user interaction and complex dynamic systems, such as mouse-based games.

Avoiding Overfitting to the Importance Weights in Offline Policy Optimization

no code implementations 29 Sep 2021 Yao Liu, Emma Brunskill

Offline policy optimization has a critical impact on many real-world decision-making problems, as online learning is costly and concerning in many applications.

Decision Making

Learning to be Fair: A Consequentialist Approach to Equitable Decision-Making

1 code implementation 18 Sep 2021 Alex Chohlas-Wood, Madison Coots, Henry Zhu, Emma Brunskill, Sharad Goel

In our approach, one first elicits stakeholder preferences over the space of possible decisions and the resulting outcomes, such as preferences for balancing spending parity against court appearance rates.

Decision Making, Fairness

On the Opportunities and Risks of Foundation Models

2 code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Design of Experiments for Stochastic Contextual Linear Bandits

no code implementations NeurIPS 2021 Andrea Zanette, Kefan Dong, Jonathan Lee, Emma Brunskill

In the stochastic linear contextual bandit setting there exist several minimax procedures for exploration with policies that are reactive to the data being acquired.

Universal Off-Policy Evaluation

1 code implementation NeurIPS 2021 Yash Chandak, Scott Niekum, Bruno Castro da Silva, Erik Learned-Miller, Emma Brunskill, Philip S. Thomas

When faced with sequential decision-making problems, it is often useful to be able to predict what would happen if decisions were made using a new policy.

counterfactual, Decision Making, +1
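
As a sketch of what distribution-level off-policy evaluation looks like, the snippet below reweights observed returns by trajectory importance ratios to estimate statistics of the return distribution under the new policy. This is generic self-normalized importance sampling for intuition only, not the paper's estimator or its confidence bounds.

```python
import numpy as np

def off_policy_return_stats(returns, behavior_logps, target_logps):
    """behavior_logps / target_logps: per-trajectory sums of log action
    probabilities under the behavior and evaluation policies."""
    returns = np.asarray(returns, dtype=float)
    rho = np.exp(np.asarray(target_logps) - np.asarray(behavior_logps))
    w = rho / rho.sum()                  # self-normalized trajectory weights
    order = np.argsort(returns)
    cdf = np.cumsum(w[order])            # weighted empirical CDF of returns
    median = float(returns[order][np.searchsorted(cdf, 0.5)])
    return {"mean": float(w @ returns), "median": median}
```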

Play to Grade: Grading Interactive Coding Games as Classifying Markov Decision Process

no code implementations 1 Jan 2021 Allen Nie, Emma Brunskill, Chris Piech

Contemporary coding education often presents students with the task of developing programs that have user interaction and complex dynamic systems, such as mouse-based games.

Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration

no code implementations NeurIPS 2020 Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.

reinforcement-learning, Reinforcement Learning (RL)

Online Model Selection for Reinforcement Learning with Function Approximation

no code implementations 19 Nov 2020 Jonathan N. Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

Towards this end, we consider the problem of model selection in RL with function approximation, given a set of candidate RL algorithms with known regret guarantees.

Model Selection, reinforcement-learning, +1

Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

no code implementations NeurIPS 2020 Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill

There has been growing progress on theoretical analyses for provably efficient learning in MDPs with linear function approximation, but much of the existing work has made strong assumptions to enable exploration by conventional exploration frameworks.

Provably Good Batch Reinforcement Learning Without Great Exploration

1 code implementation 16 Jul 2020 Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.

reinforcement-learning, Reinforcement Learning (RL)

Learning Abstract Models for Strategic Exploration and Fast Reward Transfer

1 code implementation 12 Jul 2020 Evan Zheran Liu, Ramtin Keramati, Sudarshan Seshadri, Kelvin Guu, Panupong Pasupat, Emma Brunskill, Percy Liang

Model-based reinforcement learning (RL) is appealing because (i) it enables planning and thus more strategic exploration, and (ii) by decoupling dynamics from rewards, it enables fast transfer to new reward functions.

Model-based Reinforcement Learning, Montezuma's Revenge, +2

Power Constrained Bandits

1 code implementation 13 Apr 2020 Jiayu Yao, Emma Brunskill, Weiwei Pan, Susan Murphy, Finale Doshi-Velez

However, when bandits are deployed in the context of a scientific study (e.g., a clinical trial to test whether a mobile health intervention is effective), the aim is not only to personalize for an individual, but also to determine, with sufficient statistical power, whether or not the system's intervention is effective.

Decision Making, Multi-Armed Bandits
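
For context on "sufficient statistical power": the classic two-sample z-test sample-size formula below is the kind of guarantee at stake; the paper designs bandit algorithms that preserve such power while still personalizing. This is the standard textbook calculation, not the paper's method.

```python
from scipy.stats import norm

def n_per_group(effect, sd, alpha=0.05, power=0.8):
    """Participants per arm to detect a mean difference `effect` with
    outcome standard deviation `sd` in a two-sided two-sample z-test."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return 2 * (sd * (z_a + z_b) / effect) ** 2

print(n_per_group(effect=0.5, sd=1.0))   # about 62.8 -> 63 per arm
```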

Value Driven Representation for Human-in-the-Loop Reinforcement Learning

no code implementations 2 Apr 2020 Ramtin Keramati, Emma Brunskill

In such systems there is typically an external human system designer who creates, monitors, and modifies the interactive adaptive system, trying to improve its performance on the target outcomes.

reinforcement-learning, Reinforcement Learning (RL)

Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding

1 code implementation NeurIPS 2020 Hongseok Namkoong, Ramtin Keramati, Steve Yadlowsky, Emma Brunskill

We assess robustness of OPE methods under unobserved confounding by developing worst-case bounds on the performance of an evaluation policy.

Decision Making, Management

Learning Near Optimal Policies with Low Inherent Bellman Error

no code implementations ICML 2020 Andrea Zanette, Alessandro Lazaric, Mykel Kochenderfer, Emma Brunskill

This has two important consequences: 1) it shows that exploration is possible using only batch assumptions, with an algorithm that achieves the optimal statistical rate for the setting we consider, which is more general than prior work on low-rank MDPs; and 2) the lack of closedness (measured by the inherent Bellman error) is only amplified by $\sqrt{d_t}$ despite working in the online setting.

reinforcement-learning, Reinforcement Learning (RL)

Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions

no code implementations ICML 2020 Omer Gottesman, Joseph Futoma, Yao Liu, Sonali Parbhoo, Leo Anthony Celi, Emma Brunskill, Finale Doshi-Velez

Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education, but safe deployment in high stakes settings requires ways of assessing its validity.

Off-policy evaluation, reinforcement-learning

Sublinear Optimal Policy Value Estimation in Contextual Bandits

no code implementations 12 Dec 2019 Weihao Kong, Gregory Valiant, Emma Brunskill

We study the problem of estimating the expected reward of the optimal policy in the stochastic disjoint linear bandit setting.

Multi-Armed Bandits

Limiting Extrapolation in Linear Approximate Value Iteration

no code implementations NeurIPS 2019 Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill

We prove that if the features at any state can be represented as a convex combination of features at the anchor points, then errors are propagated linearly over iterations (instead of exponentially) and our method achieves a polynomial sample complexity bound in the horizon and the number of anchor points.
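
The convex-combination condition can be checked directly as a linear-programming feasibility problem. A small sketch (it illustrates the stated condition, not the paper's algorithm): find weights w >= 0 summing to 1 such that the anchor features mix to the query features.

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(phi, anchors):
    """True if feature vector `phi` is a convex combination of the rows
    of `anchors`: solve the feasibility LP  anchors.T @ w = phi,
    sum(w) = 1, w >= 0  (any feasible point suffices, so cost is zero)."""
    anchors = np.asarray(anchors, dtype=float)
    k = anchors.shape[0]
    A_eq = np.vstack([anchors.T, np.ones((1, k))])
    b_eq = np.concatenate([np.asarray(phi, dtype=float), [1.0]])
    res = linprog(c=np.zeros(k), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * k)
    return res.success

anchors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(in_convex_hull([0.25, 0.25], anchors))  # True: inside the simplex
print(in_convex_hull([1.0, 1.0], anchors))    # False: outside
```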

Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model

no code implementations NeurIPS 2019 Andrea Zanette, Mykel J. Kochenderfer, Emma Brunskill

This paper focuses on the problem of computing an $\epsilon$-optimal policy in a discounted Markov Decision Process (MDP) provided that we can access the reward and transition function through a generative model.

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

no code implementations 5 Nov 2019 Ramtin Keramati, Christoph Dann, Alex Tamkin, Emma Brunskill

While maximizing expected return is the goal in most reinforcement learning approaches, risk-sensitive objectives such as conditional value at risk (CVaR) are more suitable for many high-stakes applications.

Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling

no code implementations ICML 2020 Yao Liu, Pierre-Luc Bacon, Emma Brunskill

Surprisingly, we find that in finite horizon MDPs there is no strict variance reduction from per-decision importance sampling or stationary importance sampling compared with vanilla importance sampling.

Off-policy evaluation
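
A compact sketch of the two estimators being compared, on trajectories of per-step importance ratios and rewards (generic textbook estimators, shown only to make the variance comparison concrete).

```python
import numpy as np

def is_estimates(trajs, gamma=1.0):
    """Each trajectory is a list of (rho_t, r_t) pairs, where rho_t is the
    single-step ratio pi_e(a_t|s_t) / pi_b(a_t|s_t)."""
    vanilla, per_decision = [], []
    for traj in trajs:
        rhos = np.array([rho for rho, _ in traj])
        rews = np.array([r for _, r in traj])
        discounts = gamma ** np.arange(len(traj))
        cum = np.cumprod(rhos)            # rho_0 * ... * rho_t
        vanilla.append(cum[-1] * np.sum(discounts * rews))
        per_decision.append(np.sum(cum * discounts * rews))
    return np.mean(vanilla), np.mean(per_decision)
```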

Directed Exploration for Reinforcement Learning

no code implementations 18 Jun 2019 Zhaohan Daniel Guo, Emma Brunskill

Efficient exploration is necessary to achieve good sample efficiency for reinforcement learning in general.

Efficient Exploration, reinforcement-learning, +1

Learning When-to-Treat Policies

1 code implementation 23 May 2019 Xinkun Nie, Emma Brunskill, Stefan Wager

Many applied decision-making problems have a dynamic component: The policymaker needs not only to choose whom to treat, but also when to start which treatment.

Decision Making

Learning Abstract Models for Long-Horizon Exploration

no code implementations ICLR 2019 Evan Zheran Liu, Ramtin Keramati, Sudarshan Seshadri, Kelvin Guu, Panupong Pasupat, Emma Brunskill, Percy Liang

In our approach, a manager maintains an abstract MDP over a subset of the abstract states, which grows monotonically through targeted exploration (possible due to the abstract MDP).

Atari Games

Learning Procedural Abstractions and Evaluating Discrete Latent Temporal Structure

1 code implementation ICLR 2019 Karan Goel, Emma Brunskill

Given a dataset of time-series, the goal is to identify the latent sequence of steps common to them and label each time-series with the temporal extent of these procedural steps.

Clustering, Time Series, +1

PLOTS: Procedure Learning from Observations using Subtask Structure

no code implementations 17 Apr 2019 Tong Mu, Karan Goel, Emma Brunskill

In many cases an intelligent agent may want to learn how to mimic a single observed demonstrated trajectory.

Procedure Learning

Off-Policy Policy Gradient with State Distribution Correction

no code implementations 17 Apr 2019 Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

We study the problem of off-policy policy optimization in Markov decision processes, and develop a novel off-policy policy gradient method.

Separating value functions across time-scales

1 code implementation 5 Feb 2019 Joshua Romoff, Peter Henderson, Ahmed Touati, Emma Brunskill, Joelle Pineau, Yann Ollivier

In settings where this bias is unacceptable (where the system must optimize for longer horizons at higher discounts), the target of the value function approximator may increase in variance, leading to difficulties in learning.

Reinforcement Learning (RL)

Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds

no code implementations 1 Jan 2019 Andrea Zanette, Emma Brunskill

Strong worst-case performance bounds for episodic reinforcement learning exist, but in practice RL algorithms fortunately perform much better than such bounds would predict.

Learning Theory, Reinforcement Learning (RL)

Distilling Information from a Flood: A Possibility for the Use of Meta-Analysis and Systematic Review in Machine Learning Research

no code implementations 3 Dec 2018 Peter Henderson, Emma Brunskill

The current flood of information in all areas of machine learning research, from computer vision to reinforcement learning, has made it difficult to make aggregate scientific inferences.

BIG-bench Machine Learning, Epidemiology

Policy Certificates: Towards Accountable Reinforcement Learning

no code implementations 7 Nov 2018 Christoph Dann, Lihong Li, Wei Wei, Emma Brunskill

The performance of a reinforcement learning algorithm can vary drastically during learning because of exploration.

reinforcement-learning, Reinforcement Learning (RL)

Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters

no code implementations 3 Jul 2018 Aniruddh Raghu, Omer Gottesman, Yao Liu, Matthieu Komorowski, Aldo Faisal, Finale Doshi-Velez, Emma Brunskill

In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy Policy Evaluation (OPE) when the true behaviour policy is unknown.

Decoupling Gradient-Like Learning Rules from Representations

no code implementations ICML 2018 Philip Thomas, Christoph Dann, Emma Brunskill

When creating a machine learning system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions.

BIG-bench Machine Learning

Regret Minimization in MDPs with Options without Prior Knowledge

no code implementations NeurIPS 2017 Ronan Fruit, Matteo Pirotta, Alessandro Lazaric, Emma Brunskill

The option framework integrates temporal abstraction into the reinforcement learning model through the introduction of macro-actions (i.e., options).

On Ensuring that Intelligent Machines Are Well-Behaved

no code implementations 17 Aug 2017 Philip S. Thomas, Bruno Castro da Silva, Andrew G. Barto, Emma Brunskill

We propose a new framework for designing machine learning algorithms that simplifies the problem of specifying and regulating undesirable behaviors.

BIG-bench Machine Learning

Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines

no code implementations NeurIPS 1999 Philip S. Thomas, Emma Brunskill

We show how an action-dependent baseline can be used with the policy gradient theorem under function approximation, which was originally presented with action-independent baselines by Sutton et al. (2000).

Policy Gradient Methods, reinforcement-learning, +1
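
For reference, this is the standard action-independent-baseline setup the paper generalizes: a tiny REINFORCE loop where subtracting a running-average baseline reduces variance without biasing the gradient (a textbook sketch, not the paper's action-dependent construction).

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                     # softmax policy over two actions
baseline, lr = 0.0, 0.1
true_means = [0.2, 0.8]                 # expected reward of each action

for t in range(3000):
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    a = rng.choice(2, p=probs)
    r = rng.normal(true_means[a], 1.0)
    grad_logp = -probs                  # d/dtheta log pi(a) = e_a - probs
    grad_logp[a] += 1.0
    theta += lr * (r - baseline) * grad_logp  # baseline cuts variance,
    baseline += 0.05 * (r - baseline)         # the gradient stays unbiased

print("learned action probabilities:", np.round(probs, 3))
```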

Decoupling Learning Rules from Representations

no code implementations 9 Jun 2017 Philip S. Thomas, Christoph Dann, Emma Brunskill

When creating an artificial intelligence system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions.

Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning

1 code implementation NeurIPS 2017 Christoph Dann, Tor Lattimore, Emma Brunskill

Statistical performance bounds for reinforcement learning (RL) algorithms can be critical for high-stakes applications like healthcare.

reinforcement-learning, Reinforcement Learning (RL)

Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation

no code implementations NeurIPS 2017 Zhaohan Daniel Guo, Philip S. Thomas, Emma Brunskill

In addition, we can take advantage of special cases that arise due to options-based policies to further improve the performance of importance sampling.

Sample Efficient Feature Selection for Factored MDPs

no code implementations 9 Mar 2017 Zhaohan Daniel Guo, Emma Brunskill

This can result in a much better sample complexity when the in-degree of the necessary features is smaller than the in-degree of all features.

feature selection, reinforcement-learning, +1

Sample Efficient Policy Search for Optimal Stopping Domains

no code implementations 21 Feb 2017 Karan Goel, Christoph Dann, Emma Brunskill

Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return.

Importance Sampling with Unequal Support

no code implementations 10 Nov 2016 Philip S. Thomas, Emma Brunskill

Importance sampling is often used in machine learning when training and testing data come from different distributions.

A PAC RL Algorithm for Episodic POMDPs

no code implementations 25 May 2016 Zhaohan Daniel Guo, Shayan Doroudi, Emma Brunskill

Many interesting real world domains involve reinforcement learning (RL) in partially observable environments.

reinforcement-learning, Reinforcement Learning (RL)

Latent Contextual Bandits and their Application to Personalized Recommendations for New Users

no code implementations 22 Apr 2016 Li Zhou, Emma Brunskill

We consider both the benefit of leveraging a set of learned latent user classes for new users, and how we can learn such latent classes from prior users.

Multi-Armed Bandits

Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning

3 code implementations 4 Apr 2016 Philip S. Thomas, Emma Brunskill

In this paper we present a new way of predicting the performance of a reinforcement learning policy given historical data that may have been generated by a different policy.

reinforcement-learning, Reinforcement Learning (RL)

Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning

no code implementations NeurIPS 2015 Christoph Dann, Emma Brunskill

In this paper, we derive an upper PAC bound $\tilde O(\frac{|\mathcal S|^2 |\mathcal A| H^2}{\epsilon^2} \ln\frac 1 \delta)$ and a lower PAC bound $\tilde \Omega(\frac{|\mathcal S| |\mathcal A| H^2}{\epsilon^2} \ln \frac 1 {\delta + c})$ that match up to log-terms and an additional linear dependency on the number of states $|\mathcal S|$.

reinforcement-learning, Reinforcement Learning (RL)
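
To read the bound's scaling, one can plug problem sizes into the upper bound with constants and logarithmic factors in $|\mathcal S|$, $|\mathcal A|$, $H$ suppressed; treat this strictly as an order-of-magnitude illustration, not a usable sample-size guarantee.

```python
import numpy as np

def pac_upper_bound(S, A, H, eps, delta):
    # Leading terms of the upper PAC bound only; constants and log
    # factors in S, A, H are dropped.
    return S**2 * A * H**2 / eps**2 * np.log(1 / delta)

print(f"{pac_upper_bound(S=10, A=4, H=20, eps=0.1, delta=0.05):.2e}")
```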

The Online Coupon-Collector Problem and Its Application to Lifelong Reinforcement Learning

no code implementations 10 Jun 2015 Emma Brunskill, Lihong Li

Transferring knowledge across a sequence of related tasks is an important challenge in reinforcement learning (RL).

reinforcement-learning, Reinforcement Learning (RL)

Online Stochastic Optimization under Correlated Bandit Feedback

no code implementations 4 Feb 2014 Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill

In this paper we consider the problem of online stochastic optimization of a locally smooth function under bandit feedback.

Stochastic Optimization

Efficient Planning under Uncertainty with Macro-actions

no code implementations 16 Jan 2014 Ruijie He, Emma Brunskill, Nicholas Roy

We also demonstrate our algorithm being used to control a real robotic helicopter in a target monitoring experiment, which suggests that our approach has practical potential for planning in real-world, large partially observable domains where a multi-step lookahead is required to achieve good performance.

Sample Complexity of Multi-task Reinforcement Learning

no code implementations 26 Sep 2013 Emma Brunskill, Lihong Li

Transferring knowledge across a sequence of reinforcement-learning tasks is challenging, and has a number of important applications.

reinforcement-learning, Reinforcement Learning (RL)

Regret Bounds for Reinforcement Learning with Policy Advice

no code implementations 5 May 2013 Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill

In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors.

reinforcement-learning, Reinforcement Learning (RL)
