Search Results for author: Philip Amortila

Found 11 papers, 1 paper with code

Scalable Online Exploration via Coverability

1 code implementation • 11 Mar 2024 • Philip Amortila, Dylan J. Foster, Akshay Krishnamurthy

We propose exploration objectives -- policy optimization objectives that enable downstream maximization of any reward function -- as a conceptual framework to systematize the study of exploration.

Efficient Exploration, Q-Learning, +1
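
For orientation, a common formalization of the coverability notion in the title (a sketch in our notation, not quoted from the paper) is

$$C_{\mathrm{cov}} \;=\; \inf_{\mu \in \Delta(\mathcal{S} \times \mathcal{A})} \, \sup_{\pi} \, \max_{s,a} \, \frac{d^\pi(s,a)}{\mu(s,a)},$$

where $d^\pi$ is the state-action occupancy measure of policy $\pi$. A small $C_{\mathrm{cov}}$ means a single distribution simultaneously covers every policy, the structural condition this line of work exploits for scalable exploration.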

Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning

no code implementations • 22 Jan 2024 • Philip Amortila, Tongyi Cao, Akshay Krishnamurthy

A pervasive phenomenon in machine learning applications is distribution shift, where training and deployment conditions for a machine learning model differ.

regression, reinforcement-learning
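
A sketch of the setting (our formalization; the symbols $\mu, \nu, f^*, \epsilon$ are introduced here for illustration): covariates are drawn from a distribution $\mu$ at training time but from a shifted distribution $\nu$ at deployment, while the regression function $f^*(x) = \mathbb{E}[y \mid x]$ is unchanged. With a misspecified class $\mathcal{F}$ satisfying

$$\inf_{f \in \mathcal{F}} \|f - f^*\|_\infty \;\le\; \epsilon,$$

the question is how the deployment-time error $\mathbb{E}_{x \sim \nu}[(\hat{f}(x) - f^*(x))^2]$ must scale with $\epsilon$ and with a measure of shift such as the density ratio $\sup_x \nu(x)/\mu(x)$.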

Harnessing Density Ratios for Online Reinforcement Learning

no code implementations • 18 Jan 2024 • Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie

The theories of offline and online reinforcement learning, despite having evolved in parallel, have begun to show signs of a possible unification, with algorithms and analysis techniques for one setting often having natural counterparts in the other.

Offline RL, reinforcement-learning
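
For context, the density-ratio object central to the offline-RL literature (the standard definition, in our notation) is the marginalized importance weight

$$w^\pi(s,a) \;=\; \frac{d^\pi(s,a)}{\mu(s,a)},$$

the ratio between the occupancy measure of a policy $\pi$ and the data distribution $\mu$; whenever $\mu$ covers $d^\pi$, it converts on-policy expectations into reweighted offline ones via $\mathbb{E}_{d^\pi}[f(s,a)] = \mathbb{E}_{\mu}[w^\pi(s,a)\, f(s,a)]$.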

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

no code implementations • 25 Jul 2023 • Philip Amortila, Nan Jiang, Csaba Szepesvári

Theoretical guarantees in reinforcement learning (RL) are known to suffer multiplicative blow-up factors with respect to the misspecification error of function approximation.

Off-policy evaluation
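
To make the multiplicative blow-up concrete (our paraphrase; $\Phi$, $\epsilon$, and $\alpha$ are illustrative symbols): if a linear class has misspecification error

$$\epsilon \;=\; \inf_{\theta} \|\Phi\theta - Q^\pi\|_\infty,$$

then off-policy value estimates typically satisfy bounds of the form $|\hat{v} - v^\pi| \le \alpha\,\epsilon$ plus statistical error, and the object of study is the smallest approximation factor $\alpha$ achievable by any estimator.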

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

no code implementations • 18 Jul 2022 • Philip Amortila, Nan Jiang, Dhruv Madeka, Dean P. Foster

Towards establishing the minimal number of expert queries needed, we show that, in the same setting, any learner whose exploration budget is polynomially bounded (in terms of $d, H,$ and $|\mathcal{A}|$) will require at least $\tilde\Omega(\sqrt{d})$ oracle calls to recover a policy competing with the expert's value function.

Imitation Learning, Reinforcement Learning (RL)

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function

no code implementations • 3 Feb 2021 • Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári

We consider local planning in fixed-horizon MDPs with a generative model under the assumption that the optimal value function lies close to the span of a feature map.

Open-Ended Question Answering
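
In symbols, the realizability assumption in the title (our sketch): the planner knows a feature map $\phi : \mathcal{S} \to \mathbb{R}^d$ such that

$$\inf_{\theta \in \mathbb{R}^d} \, \sup_{s} \, \big|V^*(s) - \langle \phi(s), \theta \rangle\big| \;\le\; \epsilon,$$

and, given generative-model access, must output a near-optimal action while issuing as few simulator queries as possible.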

A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting

no code implementations • 2 Nov 2020 • Philip Amortila, Nan Jiang, Tengyang Xie

Recently, Wang et al. (2020) showed a highly intriguing hardness result for batch reinforcement learning (RL) with linearly realizable value function and good feature coverage in the finite-horizon case.

reinforcement-learning, Reinforcement Learning (RL)
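
In rough terms (a hedged paraphrase of the assumptions, not a quote): the hardness holds even when value functions are exactly linear in known $d$-dimensional features and the batch data distribution $\mu$ has good coverage, e.g.

$$Q^\pi(s,a) \;=\; \langle \phi(s,a), \theta^\pi \rangle, \qquad \lambda_{\min}\big(\mathbb{E}_{\mu}[\phi\phi^\top]\big) \;\ge\; c \;>\; 0,$$

yet value estimation still demands a sample size exponential in the horizon; this note carries a variant of that construction over to the discounted setting.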

Exponential Lower Bounds for Planning in MDPs With Linearly-Realizable Optimal Action-Value Functions

no code implementations • 3 Oct 2020 • Gellért Weisz, Philip Amortila, Csaba Szepesvári

We consider the problem of local planning in fixed-horizon and discounted Markov Decision Processes (MDPs) with linear function approximation and a generative model under the assumption that the optimal action-value function lies in the span of a feature map that is available to the planner.
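
In symbols, the title's assumption (our sketch): there is a feature map $\phi : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^d$ available to the planner and some $\theta^* \in \mathbb{R}^d$ with

$$Q^*(s,a) \;=\; \langle \phi(s,a), \theta^* \rangle \quad \text{for all } (s,a),$$

and the lower bounds show that even so, a local planner with generative-model access can be forced to issue exponentially many queries.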

Constrained Markov Decision Processes via Backward Value Functions

no code implementations • ICML 2020 • Harsh Satija, Philip Amortila, Joelle Pineau

In standard RL, the agent is incentivized to explore any behavior as long as it maximizes rewards, but in the real world, undesired behavior can damage either the system or the agent in a way that breaks the learning process itself.

Reinforcement Learning (RL)
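
For context, the standard constrained-MDP program this work addresses (the generic formulation, not a construction specific to the paper) is

$$\max_{\pi} \; \mathbb{E}_{\pi}\Big[\sum_{t} \gamma^t r(s_t, a_t)\Big] \quad \text{s.t.} \quad \mathbb{E}_{\pi}\Big[\sum_{t} \gamma^t c(s_t, a_t)\Big] \;\le\; d,$$

where $c$ is a cost signal and $d$ a safety budget; the paper's backward value functions serve to translate such trajectory-level constraints into state-wise conditions that can be checked during learning.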

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

no code implementations • 27 Mar 2020 • Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare

We present a distributional approach to theoretical analyses of reinforcement learning algorithms with constant step-sizes.

Q-Learning, reinforcement-learning, +1
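
A minimal instance of the iterations being analyzed (standard constant step-size Q-learning, written in our notation):

$$Q_{t+1}(s_t, a_t) \;=\; (1 - \alpha)\, Q_t(s_t, a_t) \;+\; \alpha \Big( r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') \Big), \qquad \alpha \in (0,1) \text{ fixed}.$$

With a constant step-size the iterates do not converge to a point; they form a Markov chain, and the distributional viewpoint studies convergence of the law of $Q_t$ to a stationary distribution.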

Learning Graph Weighted Models on Pictures

no code implementations • 21 Jun 2018 • Philip Amortila, Guillaume Rabusseau

Graph Weighted Models (GWMs) have recently been proposed as a natural generalization of weighted automata over strings and trees to arbitrary families of labeled graphs (and hypergraphs).
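
For reference, the string case that GWMs generalize (the standard weighted-automaton definition, not specific to this paper): a weighted automaton with $n$ states assigns to a string $w = w_1 \cdots w_k$ the value

$$f(w) \;=\; \boldsymbol{\alpha}^\top A_{w_1} A_{w_2} \cdots A_{w_k} \boldsymbol{\beta},$$

with initial and final vectors $\boldsymbol{\alpha}, \boldsymbol{\beta} \in \mathbb{R}^n$ and one transition matrix $A_\sigma \in \mathbb{R}^{n \times n}$ per symbol; GWMs replace this chain of matrix products with a contraction of tensors arranged along the structure of an arbitrary labeled graph.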
