Search Results for author: Christoph Dann

Found 23 papers, 3 papers with code

A Unified Algorithm for Stochastic Path Problems

no code implementations17 Oct 2022 Christoph Dann, Chen-Yu Wei, Julian Zimmert

Our regret bound matches the best known results for the well-studied special case of stochastic shortest path (SSP) with all non-positive rewards.

Best of Both Worlds Model Selection

no code implementations29 Jun 2022 Aldo Pacchiano, Christoph Dann, Claudio Gentile

We study the problem of model selection in bandit scenarios in the presence of nested policy classes, with the goal of obtaining simultaneous adversarial and stochastic ("best of both worlds") high-probability regret guarantees.

Model Selection

Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation

no code implementations19 Jun 2022 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

This paper presents a theoretical analysis of such policies and provides the first regret and sample-complexity bounds for reinforcement learning with myopic exploration.

reinforcement-learning
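For readers unfamiliar with the policy class analyzed in this paper: epsilon-greedy is the canonical myopic-exploration rule. Below is a minimal generic sketch (not the paper's algorithm; the value estimates `q_values` are assumed to be given):

```python
import random

def epsilon_greedy_action(q_values, epsilon):
    """Pick a uniformly random action with probability epsilon,
    otherwise the greedy (highest-value) action."""
    if random.random() < epsilon:
        # Explore: this is the "myopic" exploration step -- it ignores
        # how informative each action would actually be.
        return random.randrange(len(q_values))
    # Exploit: act greedily with respect to the current value estimates.
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Example: with epsilon = 0.1, the agent exploits about 90% of the time.
print(epsilon_greedy_action([0.2, 0.8, 0.5], epsilon=0.1))
```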

Same Cause; Different Effects in the Brain

1 code implementation21 Feb 2022 Mariya Toneva, Jennifer Williams, Anand Bollu, Christoph Dann, Leila Wehbe

It is then natural to ask: "Is the activity in these different brain zones caused by the stimulus properties in the same way?"

A Model Selection Approach for Corruption Robust Reinforcement Learning

no code implementations7 Oct 2021 Chen-Yu Wei, Christoph Dann, Julian Zimmert

We develop a model selection approach to tackle reinforcement learning with adversarial corruption in both transition and reward.

Model Selection · Multi-Armed Bandits +1

Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

no code implementations NeurIPS 2021 Christoph Dann, Teodor V. Marinov, Mehryar Mohri, Julian Zimmert

Our results show that optimistic algorithms cannot achieve the information-theoretic lower bounds even in deterministic MDPs unless there is a unique optimal policy.

reinforcement-learning

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

no code implementations NeurIPS 2021 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies $\Pi$ that may not contain any near-optimal policy.

reinforcement-learning

Neural Active Learning with Performance Guarantees

no code implementations NeurIPS 2021 Pranjal Awasthi, Christoph Dann, Claudio Gentile, Ayush Sekhari, Zhilei Wang

We investigate the problem of active learning in the streaming setting in non-parametric regimes, where the labels are stochastically generated from a class of functions on which we make no assumptions whatsoever.

Active Learning · Model Selection

Regret Bound Balancing and Elimination for Model Selection in Bandits and RL

no code implementations24 Dec 2020 Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter Bartlett

Finally, unlike recent efforts in model selection for linear stochastic bandits, our approach is versatile enough to also cover cases where the context information is generated by an adversarial environment, rather than a stochastic one.

Model Selection
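The balancing-and-elimination principle behind this approach can be sketched schematically: play the active base learner with the smallest claimed regret bound, and eliminate any learner whose realized average reward falls detectably short of the best learner's, beyond its claimed bound plus sampling noise. The sketch below is an illustrative simplification, not the paper's algorithm; the `StubLearner` API and the specific slack term are assumptions.

```python
import math
import random

class StubLearner:
    """Hypothetical base learner that pulls a fixed-mean Bernoulli arm."""
    def __init__(self, mean):
        self.mean = mean
    def act(self):
        return 1.0 if random.random() < self.mean else 0.0

def balance_and_eliminate(learners, bounds, T):
    """Schematic regret-bound balancing and elimination (illustration only).

    learners: objects exposing .act() -> reward in [0, 1] (assumed API).
    bounds:   bounds[j](t) is learner j's claimed regret bound after t plays.
    """
    active = list(range(len(learners)))
    plays = [0] * len(learners)
    total = [0.0] * len(learners)
    for _ in range(T):
        # Balancing: play the active learner with the smallest claimed bound.
        i = min(active, key=lambda j: bounds[j](plays[j] + 1))
        total[i] += learners[i].act()
        plays[i] += 1
        # Elimination: a learner whose average reward trails the best by more
        # than its claimed per-play bound plus noise slack is misspecified.
        played = [j for j in active if plays[j] > 0]
        best_avg = max(total[j] / plays[j] for j in played)
        for j in played:
            slack = bounds[j](plays[j]) / plays[j] + math.sqrt(math.log(T) / plays[j])
            if total[j] / plays[j] + slack < best_avg:
                active.remove(j)
    return active

# Two stub learners; the one on the worse arm is eventually eliminated.
learners = [StubLearner(0.7), StubLearner(0.3)]
bounds = [lambda t: math.sqrt(t), lambda t: math.sqrt(t)]
print(balance_and_eliminate(learners, bounds, T=2000))
```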

Reinforcement Learning with Feedback Graphs

no code implementations NeurIPS 2020 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations.

reinforcement-learning

Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy

no code implementations5 Nov 2019 Ramtin Keramati, Christoph Dann, Alex Tamkin, Emma Brunskill

While maximizing expected return is the goal in most reinforcement learning approaches, risk-sensitive objectives such as conditional value at risk (CVaR) are more suitable for many high-stakes applications.
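Concretely, for a confidence level $\alpha$, CVaR is the mean return over the worst $\alpha$-fraction of outcomes, which is why it suits high-stakes settings. A minimal empirical estimator (a generic illustration, not the paper's optimism-based algorithm):

```python
import numpy as np

def empirical_cvar(returns, alpha=0.05):
    """Mean of the worst alpha-fraction of sampled returns (higher = better)."""
    r = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(alpha * len(r))))  # size of the worst tail
    return r[:k].mean()

# A heavy left tail drags CVaR far below the mean return.
samples = np.random.default_rng(0).normal(loc=1.0, scale=1.0, size=10_000)
print(f"mean={samples.mean():.3f}  CVaR_0.05={empirical_cvar(samples):.3f}")
```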

Policy Certificates: Towards Accountable Reinforcement Learning

no code implementations7 Nov 2018 Christoph Dann, Lihong Li, Wei Wei, Emma Brunskill

The performance of a reinforcement learning algorithm can vary drastically during learning because of exploration.

reinforcement-learning

Decoupling Gradient-Like Learning Rules from Representations

no code implementations ICML 2018 Philip Thomas, Christoph Dann, Emma Brunskill

When creating a machine learning system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions.

BIG-bench Machine Learning

Decoupling Learning Rules from Representations

no code implementations9 Jun 2017 Philip S. Thomas, Christoph Dann, Emma Brunskill

When creating an artificial intelligence system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions.

Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning

1 code implementation NeurIPS 2017 Christoph Dann, Tor Lattimore, Emma Brunskill

Statistical performance bounds for reinforcement learning (RL) algorithms can be critical for high-stakes applications like healthcare.

reinforcement-learning
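The Uniform-PAC guarantee introduced here unifies the two styles of bound roughly as follows (a paraphrase of the paper's definition, with $V^*$ the optimal value and $\pi_k$ the policy played in episode $k$): with probability at least $1-\delta$, simultaneously for all $\epsilon > 0$,

$\big|\{\, k : V^* - V^{\pi_k} > \epsilon \,\}\big| \le \mathrm{poly}\big(1/\epsilon,\, \log(1/\delta)\big),$

which recovers a standard PAC bound by fixing a single $\epsilon$, and a sublinear high-probability regret bound by summing the per-episode gaps.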

Sample Efficient Policy Search for Optimal Stopping Domains

no code implementations21 Feb 2017 Karan Goel, Christoph Dann, Emma Brunskill

Optimal stopping problems consider the question of deciding when to stop an observation-generating process in order to maximize a return.
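A toy instance makes the setting concrete: with i.i.d. Uniform(0, 1) observations, a threshold policy stops at the first value above a cutoff, and its expected return can be estimated by simulation. The sketch below (our illustration, not the paper's method) searches over thresholds by Monte Carlo:

```python
import random

def run_threshold_policy(threshold, horizon=20):
    """Stop at the first observation >= threshold; return the value taken
    (0.0 if the horizon ends before we ever stop)."""
    for _ in range(horizon):
        x = random.random()  # observation stream: i.i.d. Uniform(0, 1)
        if x >= threshold:
            return x
    return 0.0

def estimate_value(threshold, episodes=2_000):
    """Monte Carlo estimate of the threshold policy's expected return."""
    return sum(run_threshold_policy(threshold) for _ in range(episodes)) / episodes

# Naive policy search: sweep a grid of thresholds, keep the best estimate.
best = max((t / 10 for t in range(10)), key=estimate_value)
print("best threshold ~", best)
```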

Memory Lens: How Much Memory Does an Agent Use?

no code implementations21 Nov 2016 Christoph Dann, Katja Hofmann, Sebastian Nowozin

The study of memory as information that flows from the past to the current action opens avenues to understand and improve successful reinforcement learning algorithms.

reinforcement-learning

Thoughts on Massively Scalable Gaussian Processes

3 code implementations5 Nov 2015 Andrew Gordon Wilson, Christoph Dann, Hannes Nickisch

This multi-level circulant approximation allows one to unify the orthogonal computational benefits of fast Kronecker and Toeplitz approaches, and is significantly faster than either approach in isolation; 2) local kernel interpolation and inducing points to allow for arbitrarily located data inputs, and $O(1)$ test-time predictions; 3) exploiting block-Toeplitz Toeplitz-block (BTTB) structure, which enables fast inference and learning when multidimensional Kronecker structure is not present; and 4) projections of the input space to flexibly model correlated inputs and high-dimensional data.

Gaussian Processes
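To make the "fast Toeplitz" claim above concrete: any $n \times n$ Toeplitz matrix embeds in a circulant matrix, so matrix-vector products cost $O(n \log n)$ via the FFT instead of $O(n^2)$. A standard self-contained illustration (not code from the paper):

```python
import numpy as np

def toeplitz_matvec(c, r, x):
    """Multiply a Toeplitz matrix by x in O(n log n) via circulant embedding.

    c: first column of the Toeplitz matrix (length n).
    r: first row (length n, with r[0] == c[0]).
    """
    n = len(c)
    # First column of a (2n-1)-circulant whose top-left n x n block is Toeplitz:
    col = np.concatenate([c, r[:0:-1]])  # c, then r[n-1], ..., r[1]
    pad = np.concatenate([x, np.zeros(n - 1)])
    # Circulant matvec = circular convolution, diagonalized by the FFT.
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(pad))
    return y[:n].real

# Sanity check against the dense product.
rng = np.random.default_rng(0)
c, r, x = rng.normal(size=8), rng.normal(size=8), rng.normal(size=8)
r[0] = c[0]
dense = np.array([[c[i - j] if i >= j else r[j - i] for j in range(8)]
                  for i in range(8)])
assert np.allclose(dense @ x, toeplitz_matvec(c, r, x))
print(toeplitz_matvec(c, r, x))
```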

Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning

no code implementations NeurIPS 2015 Christoph Dann, Emma Brunskill

In this paper, we derive an upper PAC bound $\tilde O(\frac{|\mathcal S|^2 |\mathcal A| H^2}{\epsilon^2} \ln\frac 1 \delta)$ and a lower PAC bound $\tilde \Omega(\frac{|\mathcal S| |\mathcal A| H^2}{\epsilon^2} \ln \frac 1 {\delta + c})$ that match up to log-terms and an additional linear dependency on the number of states $|\mathcal S|$.

reinforcement-learning

The Human Kernel

no code implementations NeurIPS 2015 Andrew Gordon Wilson, Christoph Dann, Christopher G. Lucas, Eric P. Xing

Bayesian nonparametric models, such as Gaussian processes, provide a compelling framework for automatic statistical modelling: these models have a high degree of flexibility, and automatically calibrated complexity.

Gaussian Processes

Bayesian Time-of-Flight for Realtime Shape, Illumination and Albedo

no code implementations22 Jul 2015 Amit Adam, Christoph Dann, Omer Yair, Shai Mazor, Sebastian Nowozin

We propose a computational model for shape, illumination and albedo inference in a pulsed time-of-flight (TOF) camera.
