Search Results for author: Philip Amortila

Found 11 papers, 1 paper with code

Scalable Online Exploration via Coverability

1 code implementation • 11 Mar 2024 • Philip Amortila, Dylan J. Foster, Akshay Krishnamurthy

We propose exploration objectives -- policy optimization objectives that enable downstream maximization of any reward function -- as a conceptual framework to systematize the study of exploration.

Efficient Exploration, Q-Learning, +1
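
For orientation, a common formalization of the coverability notion in the title (a sketch in our notation, not quoted from the paper) is

$$C_{\mathrm{cov}} \;=\; \inf_{\mu \in \Delta(\mathcal{S} \times \mathcal{A})} \, \sup_{\pi} \, \max_{s,a} \, \frac{d^\pi(s,a)}{\mu(s,a)},$$

where $d^\pi$ is the state-action occupancy measure of policy $\pi$. A small $C_{\mathrm{cov}}$ means a single distribution simultaneously covers every policy, the structural condition this line of work exploits for scalable exploration.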

Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning

no code implementations • 22 Jan 2024 • Philip Amortila, Tongyi Cao, Akshay Krishnamurthy

A pervasive phenomenon in machine learning applications is distribution shift, where training and deployment conditions for a machine learning model differ.

regression, reinforcement-learning
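
A sketch of the setting (our formalization; the symbols $\mu, \nu, f^*, \epsilon$ are introduced here for illustration): covariates are drawn from a distribution $\mu$ at training time but from a shifted distribution $\nu$ at deployment, while the regression function $f^*(x) = \mathbb{E}[y \mid x]$ is unchanged. With a misspecified class $\mathcal{F}$ satisfying

$$\inf_{f \in \mathcal{F}} \|f - f^*\|_\infty \;\le\; \epsilon,$$

the question is how the deployment-time error $\mathbb{E}_{x \sim \nu}[(\hat{f}(x) - f^*(x))^2]$ must scale with $\epsilon$ and with a measure of shift such as the density ratio $\sup_x \nu(x)/\mu(x)$.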

Harnessing Density Ratios for Online Reinforcement Learning

no code implementations • 18 Jan 2024 • Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie

The theories of offline and online reinforcement learning, despite having evolved in parallel, have begun to show signs of a possible unification, with algorithms and analysis techniques for one setting often having natural counterparts in the other.

Offline RL, reinforcement-learning
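
For context, the density-ratio object central to the offline-RL literature (the standard definition, in our notation) is the marginalized importance weight

$$w^\pi(s,a) \;=\; \frac{d^\pi(s,a)}{\mu(s,a)},$$

the ratio between the occupancy measure of a policy $\pi$ and the data distribution $\mu$; whenever $\mu$ covers $d^\pi$, it converts on-policy expectations into reweighted offline ones via $\mathbb{E}_{d^\pi}[f(s,a)] = \mathbb{E}_{\mu}[w^\pi(s,a)\, f(s,a)]$.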

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

no code implementations • 25 Jul 2023 • Philip Amortila, Nan Jiang, Csaba Szepesvári

Theoretical guarantees in reinforcement learning (RL) are known to suffer multiplicative blow-up factors with respect to the misspecification error of function approximation.

Off-policy evaluation
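
To make the multiplicative blow-up concrete (our paraphrase; $\Phi$, $\epsilon$, and $\alpha$ are illustrative symbols): if a linear class has misspecification error

$$\epsilon \;=\; \inf_{\theta} \|\Phi\theta - Q^\pi\|_\infty,$$

then off-policy value estimates typically satisfy bounds of the form $|\hat{v} - v^\pi| \le \alpha\,\epsilon$ plus statistical error, and the object of study is the smallest approximation factor $\alpha$ achievable by any estimator.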

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

no code implementations • 18 Jul 2022 • Philip Amortila, Nan Jiang, Dhruv Madeka, Dean P. Foster

Towards establishing the minimal number of expert queries needed, we show that, in the same setting, any learner whose exploration budget is polynomially bounded (in terms of $d, H,$ and $|\mathcal{A}|$) will require at least $\tilde\Omega(\sqrt{d})$ oracle calls to recover a policy competing with the expert's value function.

Imitation Learning, Reinforcement Learning (RL)

On Query-efficient Planning in MDPs under Linear Realizability of the Optimal State-value Function

no code implementations • 3 Feb 2021 • Gellért Weisz, Philip Amortila, Barnabás Janzer, Yasin Abbasi-Yadkori, Nan Jiang, Csaba Szepesvári

We consider local planning in fixed-horizon MDPs with a generative model under the assumption that the optimal value function lies close to the span of a feature map.

Open-Ended Question Answering
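
In symbols, the realizability assumption in the title (our sketch): the planner knows a feature map $\phi : \mathcal{S} \to \mathbb{R}^d$ such that

$$\inf_{\theta \in \mathbb{R}^d} \, \sup_{s} \, \big|V^*(s) - \langle \phi(s), \theta \rangle\big| \;\le\; \epsilon,$$

and, given generative-model access, must output a near-optimal action while issuing as few simulator queries as possible.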

A Variant of the Wang-Foster-Kakade Lower Bound for the Discounted Setting

no code implementations • 2 Nov 2020 • Philip Amortila, Nan Jiang, Tengyang Xie

Recently, Wang et al. (2020) showed a highly intriguing hardness result for batch reinforcement learning (RL) with linearly realizable value function and good feature coverage in the finite-horizon case.

reinforcement-learning, Reinforcement Learning (RL)
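
In rough terms (a hedged paraphrase of the assumptions, not a quote): the hardness holds even when value functions are exactly linear in known $d$-dimensional features and the batch data distribution $\mu$ has good coverage, e.g.

$$Q^\pi(s,a) \;=\; \langle \phi(s,a), \theta^\pi \rangle, \qquad \lambda_{\min}\big(\mathbb{E}_{\mu}[\phi\phi^\top]\big) \;\ge\; c \;>\; 0,$$

yet value estimation still demands a sample size exponential in the horizon; this note carries a variant of that construction over to the discounted setting.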

Exponential Lower Bounds for Planning in MDPs With Linearly-Realizable Optimal Action-Value Functions

no code implementations • 3 Oct 2020 • Gellért Weisz, Philip Amortila, Csaba Szepesvári

We consider the problem of local planning in fixed-horizon and discounted Markov Decision Processes (MDPs) with linear function approximation and a generative model under the assumption that the optimal action-value function lies in the span of a feature map that is available to the planner.
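
In symbols, the title's assumption (our sketch): there is a feature map $\phi : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^d$ available to the planner and some $\theta^* \in \mathbb{R}^d$ with

$$Q^*(s,a) \;=\; \langle \phi(s,a), \theta^* \rangle \quad \text{for all } (s,a),$$

and the lower bounds show that even so, a local planner with generative-model access can be forced to issue exponentially many queries.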

Constrained Markov Decision Processes via Backward Value Functions

no code implementations • ICML 2020 • Harsh Satija, Philip Amortila, Joelle Pineau

In standard RL, the agent is incentivized to explore any behavior as long as it maximizes rewards, but in the real world, undesired behavior can damage either the system or the agent in a way that breaks the learning process itself.

Reinforcement Learning (RL)
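
For context, the standard constrained-MDP program this work addresses (the generic formulation, not a construction specific to the paper) is

$$\max_{\pi} \; \mathbb{E}_{\pi}\Big[\sum_{t} \gamma^t r(s_t, a_t)\Big] \quad \text{s.t.} \quad \mathbb{E}_{\pi}\Big[\sum_{t} \gamma^t c(s_t, a_t)\Big] \;\le\; d,$$

where $c$ is a cost signal and $d$ a safety budget; the paper's backward value functions serve to translate such trajectory-level constraints into state-wise conditions that can be checked during learning.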

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

no code implementations • 27 Mar 2020 • Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare

We present a distributional approach to theoretical analyses of reinforcement learning algorithms with constant step-sizes.

Q-Learning, reinforcement-learning, +1
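
A minimal instance of the iterations being analyzed (standard constant step-size Q-learning, written in our notation):

$$Q_{t+1}(s_t, a_t) \;=\; (1 - \alpha)\, Q_t(s_t, a_t) \;+\; \alpha \Big( r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') \Big), \qquad \alpha \in (0,1) \text{ fixed}.$$

With a constant step-size the iterates do not converge to a point; they form a Markov chain, and the distributional viewpoint studies convergence of the law of $Q_t$ to a stationary distribution.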

Learning Graph Weighted Models on Pictures

no code implementations • 21 Jun 2018 • Philip Amortila, Guillaume Rabusseau

Graph Weighted Models (GWMs) have recently been proposed as a natural generalization of weighted automata over strings and trees to arbitrary families of labeled graphs (and hypergraphs).
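
For reference, the string case that GWMs generalize (the standard weighted-automaton definition, not specific to this paper): a weighted automaton with $n$ states assigns to a string $w = w_1 \cdots w_k$ the value

$$f(w) \;=\; \boldsymbol{\alpha}^\top A_{w_1} A_{w_2} \cdots A_{w_k} \boldsymbol{\beta},$$

with initial and final vectors $\boldsymbol{\alpha}, \boldsymbol{\beta} \in \mathbb{R}^n$ and one transition matrix $A_\sigma \in \mathbb{R}^{n \times n}$ per symbol; GWMs replace this chain of matrix products with a contraction of tensors arranged along the structure of an arbitrary labeled graph.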
