no code implementations • 26 Mar 2023 • Nermin Caber, Bashar I. Ahmad, Jiaming Liang, Simon Godsill, Alexandra Bremers, Philip Thomas, David Oxtoby, Lee Skrypchuk
Monitoring drivers' mental workload facilitates initiating and maintaining safe interactions with in-vehicle information systems, and thus enables adaptive human-machine interaction with reduced impact on the primary task of driving.
no code implementations • 23 Feb 2023 • Vincent Liu, Yash Chandak, Philip Thomas, Martha White
In this work, we consider the off-policy policy evaluation problem for contextual bandits and finite-horizon reinforcement learning in the nonstationary setting.
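As a reference point for the problem setting, here is a minimal sketch of the basic importance-sampling estimator for off-policy evaluation in a (stationary) contextual bandit. The policies, synthetic data, and the helper `target_probs` are illustrative assumptions only, not the paper's nonstationary estimators.

```python
# Minimal importance-sampling (IS) off-policy evaluation sketch for a
# contextual bandit. Everything here is illustrative: a uniform logging
# policy, a synthetic reward, and a fixed target policy.
import numpy as np

rng = np.random.default_rng(0)
n, n_actions = 10_000, 3

# Logged data: contexts x, actions a from the behavior policy, rewards r.
x = rng.normal(size=(n, 2))
behavior_probs = np.full((n, n_actions), 1.0 / n_actions)  # uniform logging
a = rng.integers(0, n_actions, size=n)
r = (x[:, 0] > 0).astype(float) * (a == 1) + rng.normal(0, 0.1, size=n)

def target_probs(x):
    """Hypothetical evaluation policy: most mass on action 1, regardless of context."""
    p = np.full((x.shape[0], n_actions), 0.1)
    p[:, 1] = 0.8
    return p

# IS estimate of the target policy's expected reward:
#   V_hat = (1/n) * sum_i [ pi_e(a_i | x_i) / pi_b(a_i | x_i) ] * r_i
w = target_probs(x)[np.arange(n), a] / behavior_probs[np.arange(n), a]
v_hat = np.mean(w * r)
print(f"IS estimate of target policy value: {v_hat:.3f}")
```

The nonstationary setting the paper studies adds a further difficulty not modeled above: the reward and context distributions drift over time, so logged data from the past may no longer reflect the environment being evaluated.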
no code implementations • ICML 2018 • Philip Thomas, Christoph Dann, Emma Brunskill
When creating a machine learning system, we must make two decisions: what representation should be used (i.e., what parameterized function should be used) and what learning rule should be used to search through the resulting set of representable functions.
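To make the dichotomy concrete, the toy sketch below separates the two decisions for value prediction: a representation (linear values over one-hot features) and a learning rule (semi-gradient TD(0)). The function names and setup are hypothetical illustrations, not the paper's construction.

```python
# Two separable design choices: a representation and a learning rule.
import numpy as np

def features(s, n_states=5):
    """Representation choice: one-hot features for a small chain MDP."""
    phi = np.zeros(n_states)
    phi[s] = 1.0
    return phi

def td0_update(w, s, r, s_next, alpha=0.1, gamma=0.9):
    """Learning-rule choice: semi-gradient TD(0) on the linear values."""
    td_error = r + gamma * w @ features(s_next) - w @ features(s)
    return w + alpha * td_error * features(s)
```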
no code implementations • 26 May 2014 • Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu
In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to design reliable, convergent, and robust reinforcement learning algorithms; (ii) how to guarantee that reinforcement learning satisfies pre-specified "safety" guarantees and remains in a stable region of the parameter space; (iii) how to design "off-policy" temporal difference learning algorithms in a reliable and stable manner; and finally (iv) how to integrate the study of reinforcement learning into the rich theory of stochastic optimization.
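The off-policy stability question in (iii) is commonly addressed by gradient-TD methods. As a hedged reference point, the sketch below gives a GTD2-style two-timescale update with linear features, which remains stable under off-policy sampling where plain TD can diverge; these are the standard GTD2 equations, not necessarily the proximal or mirror-descent algorithms the paper develops.

```python
# GTD2-style update with linear function approximation (standard form).
import numpy as np

def gtd2_update(theta, w, phi, phi_next, reward, rho,
                alpha=0.01, beta=0.1, gamma=0.99):
    """One GTD2 step.

    theta: value-function weights; w: auxiliary weights estimating the
    expected TD error; rho: importance-sampling ratio pi(a|s) / mu(a|s).
    """
    # TD error under the current value estimate.
    delta = reward + gamma * theta @ phi_next - theta @ phi
    # Main-weight update on the fast timescale, corrected by rho.
    theta = theta + alpha * rho * (phi - gamma * phi_next) * (phi @ w)
    # Auxiliary-weight update tracking the expected (corrected) TD error.
    w = w + beta * (rho * delta - phi @ w) * phi
    return theta, w
```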