Search Results for author: Dean P. Foster

Found 23 papers, 0 papers with code

Linear Reinforcement Learning with Ball Structure Action Space

no code implementations · 14 Nov 2022 · Zeyu Jia, Randy Jia, Dhruv Madeka, Dean P. Foster

We study the problem of Reinforcement Learning (RL) with linear function approximation, i.e., assuming the optimal action-value function is linear in a known $d$-dimensional feature mapping.

Reinforcement Learning (RL)
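
As a toy illustration of this realizability assumption (a sketch with invented sizes and values, not the paper's algorithm or its ball-structure action space):

```python
import numpy as np

# Toy sketch of linear realizability: Q*(s, a) = <w*, phi(s, a)> for a known
# d-dimensional feature map phi and an unknown weight vector w*.
d, n_states, n_actions = 4, 3, 5
rng = np.random.default_rng(0)
features = rng.normal(size=(n_states, n_actions, d))  # known feature map phi(s, a)
w_star = rng.normal(size=d)                           # unknown weights to be learned

def q_star(s, a):
    """Optimal action-value under the linear realizability assumption."""
    return features[s, a] @ w_star

def greedy_action(s):
    """Greedy policy induced by the linear Q-function."""
    return int(np.argmax(features[s] @ w_star))

print([greedy_action(s) for s in range(n_states)])
```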

Smooth Calibration, Leaky Forecasts, Finite Recall, and Nash Dynamics

no code implementations · 13 Oct 2022 · Dean P. Foster, Sergiu Hart

We propose to smooth out the calibration score, which measures how good a forecaster is, by combining nearby forecasts.
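
A rough sketch of the idea, using an illustrative triangular kernel and bandwidth rather than the paper's exact construction:

```python
import numpy as np

# Smoothed calibration sketch: instead of grouping periods whose forecasts are
# exactly equal, each period's error averages over periods with *nearby*
# forecasts, weighted by a kernel.  Kernel and bandwidth are arbitrary here.
def smooth_calibration_score(forecasts, outcomes, bandwidth=0.1):
    f = np.asarray(forecasts, dtype=float)
    a = np.asarray(outcomes, dtype=float)
    score = 0.0
    for t in range(len(f)):
        w = np.maximum(0.0, 1.0 - np.abs(f - f[t]) / bandwidth)  # triangular kernel
        score += abs(np.sum(w * (a - f)) / np.sum(w))
    return score / len(f)

rng = np.random.default_rng(1)
f = rng.uniform(size=200)
a = (rng.uniform(size=200) < f).astype(float)  # outcomes drawn with prob = forecast
print(smooth_calibration_score(f, a))          # near 0 for calibrated forecasts
```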

Forecast Hedging and Calibration

no code implementations · 13 Oct 2022 · Dean P. Foster, Sergiu Hart

Calibration means that forecasts and average realized frequencies are close.
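
A minimal binned version of this check, assuming binary outcomes and probability forecasts (the bin width is an illustrative choice):

```python
import numpy as np

# Binned calibration distance: within each forecast bin, compare the average
# forecast to the realized frequency, weighted by how often the bin is used.
def calibration_score(forecasts, outcomes, n_bins=10):
    f = np.asarray(forecasts, dtype=float)
    a = np.asarray(outcomes, dtype=float)
    bins = np.minimum((f * n_bins).astype(int), n_bins - 1)
    score = 0.0
    for b in range(n_bins):
        m = bins == b
        if m.any():
            score += m.mean() * abs(a[m].mean() - f[m].mean())
    return score

rng = np.random.default_rng(2)
f = rng.uniform(size=1000)
a = (rng.uniform(size=1000) < f).astype(float)
print(calibration_score(f, a))  # near 0: forecasts match realized frequencies
```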

Deep Inventory Management

no code implementations · 6 Oct 2022 · Dhruv Madeka, Kari Torkkola, Carson Eisenach, Anna Luo, Dean P. Foster, Sham M. Kakade

This work provides a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching.

Management, Model-based Reinforcement Learning +2

"Calibeating": Beating Forecasters at Their Own Game

no code implementations · 11 Sep 2022 · Dean P. Foster, Sergiu Hart

In order to identify expertise, forecasters should not be tested by their calibration score, which can always be made arbitrarily small, but rather by their Brier score.
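
A small sketch of the Brier score and its classical binned decomposition into a calibration term plus a refinement term (the decomposition is exact only when forecasts are constant within bins; binning here is illustrative):

```python
import numpy as np

# Brier score = mean squared error between probability forecasts and binary
# outcomes; binned, it splits into calibration + refinement, which is why a
# tiny calibration score alone does not certify expertise.
def brier_decomposition(forecasts, outcomes, n_bins=10):
    f = np.asarray(forecasts, dtype=float)
    a = np.asarray(outcomes, dtype=float)
    brier = np.mean((f - a) ** 2)
    bins = np.minimum((f * n_bins).astype(int), n_bins - 1)
    calibration = refinement = 0.0
    for b in range(n_bins):
        m = bins == b
        if m.any():
            p_bar, a_bar = f[m].mean(), a[m].mean()
            calibration += m.mean() * (p_bar - a_bar) ** 2
            refinement += m.mean() * a_bar * (1.0 - a_bar)
    return brier, calibration, refinement

rng = np.random.default_rng(3)
f = rng.uniform(size=5000)
a = (rng.uniform(size=5000) < f).astype(float)
print(brier_decomposition(f, a))  # brier ~ calibration + refinement
```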

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation

no code implementations · 18 Jul 2022 · Philip Amortila, Nan Jiang, Dhruv Madeka, Dean P. Foster

Towards establishing the minimal amount of expert queries needed, we show that, in the same setting, any learner whose exploration budget is polynomially-bounded (in terms of $d, H,$ and $|\mathcal{A}|$) will require at least $\tilde\Omega(\sqrt{d})$ oracle calls to recover a policy competing with the expert's value function.

Imitation Learning, Reinforcement Learning (RL)

On Submodular Contextual Bandits

no code implementations · 3 Dec 2021 · Dean P. Foster, Alexander Rakhlin

We consider the problem of contextual bandits where actions are subsets of a ground set and mean rewards are modeled by an unknown monotone submodular function that belongs to a class $\mathcal{F}$.

Multi-Armed Bandits

The Benefits of Implicit Regularization from SGD in Least Squares Problems

no code implementations · NeurIPS 2021 · Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean P. Foster, Sham M. Kakade

Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice, which has been hypothesized to play an important role in the generalization of modern machine learning approaches.

regression
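
A toy least-squares experiment along these lines (sizes, step size, and the ridge parameter are arbitrary choices, not the paper's setup):

```python
import numpy as np

# One pass of constant-step-size SGD on a least-squares problem, compared
# against an explicit ridge solution: the SGD iterate stays close to a
# regularized solution rather than an unregularized fit.
rng = np.random.default_rng(4)
n, d = 200, 50
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.5 * rng.normal(size=n)

w = np.zeros(d)
step = 0.01
for i in rng.permutation(n):            # single pass over the data
    w += step * (y[i] - X[i] @ w) * X[i]

w_ridge = np.linalg.solve(X.T @ X + 10.0 * np.eye(d), X.T @ y)
print(np.linalg.norm(w - w_ridge))      # SGD iterate vs. an explicit ridge fit
```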

Threshold Martingales and the Evolution of Forecasts

no code implementations · 14 May 2021 · Dean P. Foster, Robert A. Stine

In addition to being calibrated, a threshold martingale has quadratic variation that accumulates to a total determined by a quantile of the initial forecast distribution.
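
As a rough sketch, the quadratic variation of a forecast path is the accumulated squared one-step revisions (the path below is synthetic, not a threshold martingale):

```python
import numpy as np

# Quadratic variation of a forecast sequence: sum of squared revisions.
rng = np.random.default_rng(5)
forecasts = np.clip(0.5 + np.cumsum(rng.normal(scale=0.05, size=100)), 0.0, 1.0)
quadratic_variation = np.sum(np.diff(forecasts) ** 2)
print(quadratic_variation)
```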

What are the Statistical Limits of Offline RL with Linear Function Approximation?

no code implementations · 22 Oct 2020 · Ruosong Wang, Dean P. Foster, Sham M. Kakade

Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision making strategies.

Decision Making, Offline RL +2

Coupled Recurrent Models for Polyphonic Music Composition

no code implementations · 20 Nov 2018 · John Thickstun, Zaid Harchaoui, Dean P. Foster, Sham M. Kakade

This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music.

Time Series Analysis

Large scale canonical correlation analysis with iterative least squares

no code implementations · NeurIPS 2014 · Yichao Lu, Dean P. Foster

Canonical Correlation Analysis (CCA) is a widely used statistical tool with both well established theory and favorable performance for a wide range of machine learning problems.

BIG-bench Machine Learning
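
A sketch of the iterative-least-squares idea for the top canonical pair, written as a power-iteration-style loop (whitening and the deflation needed for further components are omitted; data and sizes are invented):

```python
import numpy as np

# Alternate two least-squares problems: regress one view's projection on the
# other view, renormalize, and repeat until the top canonical pair emerges.
def cca_top_pair(X, Y, n_iter=50):
    rng = np.random.default_rng(6)
    v = rng.normal(size=Y.shape[1])
    for _ in range(n_iter):
        u, *_ = np.linalg.lstsq(X, Y @ v, rcond=None)  # regress Yv on X
        u /= np.linalg.norm(X @ u)
        v, *_ = np.linalg.lstsq(Y, X @ u, rcond=None)  # regress Xu on Y
        v /= np.linalg.norm(Y @ v)
    return u, v, (X @ u) @ (Y @ v)                     # directions + correlation

rng = np.random.default_rng(7)
z = rng.normal(size=(500, 1))                          # shared latent signal
X = np.hstack([z + 0.1 * rng.normal(size=(500, 1)), rng.normal(size=(500, 3))])
Y = np.hstack([z + 0.1 * rng.normal(size=(500, 1)), rng.normal(size=(500, 2))])
print(cca_top_pair(X, Y)[2])                           # close to 1
```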

Faster Ridge Regression via the Subsampled Randomized Hadamard Transform

no code implementations · NeurIPS 2013 · Yichao Lu, Paramveer Dhillon, Dean P. Foster, Lyle Ungar

We propose a fast algorithm for ridge regression when the number of features is much larger than the number of observations ($p \gg n$).

regression
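
A compact sketch of the transform applied to the feature dimension before solving ridge (sizes, the subsampling rate, and lambda are illustrative; p must be a power of two for the Hadamard transform):

```python
import numpy as np

# SRHT sketch for ridge with p >> n: randomly sign-flip the features (D), mix
# them with a fast Walsh-Hadamard transform (H), keep a random subsample of
# the mixed features, then solve ridge in the smaller sketched space.
def fwht(a):
    """Normalized fast Walsh-Hadamard transform along the last axis."""
    a = a.copy()
    h, n = 1, a.shape[-1]
    while h < n:
        for i in range(0, n, 2 * h):
            x = a[..., i:i + h].copy()
            y = a[..., i + h:i + 2 * h].copy()
            a[..., i:i + h], a[..., i + h:i + 2 * h] = x + y, x - y
        h *= 2
    return a / np.sqrt(n)

rng = np.random.default_rng(8)
n, p, k, lam = 100, 1024, 256, 1.0            # keep k of the p features
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) / np.sqrt(p) + 0.1 * rng.normal(size=n)

X_mixed = fwht(X * rng.choice([-1.0, 1.0], size=p))    # X D H
cols = rng.choice(p, size=k, replace=False)
X_sketch = np.sqrt(p / k) * X_mixed[:, cols]

w_sketch = np.linalg.solve(X_sketch.T @ X_sketch + lam * np.eye(k), X_sketch.T @ y)
w_full = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)  # exact dual ridge
print(np.mean((X_sketch @ w_sketch - y) ** 2), np.mean((X @ w_full - y) ** 2))
```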

One-shot learning and big data with n=2

no code implementations · NeurIPS 2013 · Lee H. Dicker, Dean P. Foster

One of the salient features of our analysis is that the problems studied here are easier when the dimension of $x_i$ is large; in other words, prediction becomes easier when more context is provided.

One-Shot Learning

New Subsampling Algorithms for Fast Least Squares Regression

no code implementations · NeurIPS 2013 · Paramveer Dhillon, Yichao Lu, Dean P. Foster, Lyle Ungar

We address the problem of fast estimation of ordinary least squares (OLS) from large amounts of data ($n \gg p$).

regression
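
The simplest instance of the idea is uniform row subsampling followed by OLS on the subsample (the paper's algorithms use smarter sampling; this is only the baseline):

```python
import numpy as np

# With n >> p, an OLS fit on a small uniform subsample of rows already comes
# close to the full-data OLS solution.
rng = np.random.default_rng(9)
n, p, m = 100_000, 20, 2_000                   # keep m of the n observations
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

rows = rng.choice(n, size=m, replace=False)
w_sub, *_ = np.linalg.lstsq(X[rows], y[rows], rcond=None)
w_full, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.linalg.norm(w_sub - w_full))          # small: subsample nearly recovers OLS
```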

A Spectral Algorithm for Latent Dirichlet Allocation

no code implementations · NeurIPS 2012 · Anima Anandkumar, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Yi-Kai Liu

This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA).

Clustering, Topic Models

Stochastic convex optimization with bandit feedback

no code implementations · NeurIPS 2011 · Alekh Agarwal, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Alexander Rakhlin

This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit feedback model.
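
A standard zeroth-order ingredient in this area, shown as a sketch (this is the generic two-point gradient estimator, not the specific algorithm of the paper; the objective and constants are invented):

```python
import numpy as np

# Estimate a gradient from two noisy function evaluations at symmetric
# perturbations, then take a gradient step; only bandit (function-value)
# feedback is used.
rng = np.random.default_rng(10)
d, T, delta, step = 5, 5000, 0.1, 0.01
x_opt = rng.normal(size=d)

def f_noisy(x):
    """Noisy bandit evaluation of a smooth convex function (toy objective)."""
    return np.sum((x - x_opt) ** 2) + 0.01 * rng.normal()

x = np.zeros(d)
for _ in range(T):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)                              # random unit direction
    g = d * (f_noisy(x + delta * u) - f_noisy(x - delta * u)) / (2 * delta) * u
    x -= step * g                                       # projection onto X omitted
print(np.linalg.norm(x - x_opt))                        # small: near the minimizer
```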

Multi-View Learning of Word Embeddings via CCA

no code implementations · NeurIPS 2011 · Paramveer Dhillon, Dean P. Foster, Lyle H. Ungar

Recently, there has been substantial interest in using large amounts of unlabeled data to learn word representations which can then be used as features in supervised classifiers for NLP tasks.

Chunking, Multi-View Learning +4

A Risk Comparison of Ordinary Least Squares vs Ridge Regression

no code implementations · 4 May 2011 · Paramveer S. Dhillon, Dean P. Foster, Sham M. Kakade, Lyle H. Ungar

We compare the risk of ridge regression to a simple variant of ordinary least squares, in which one simply projects the data onto a finite dimensional subspace (as specified by a Principal Component Analysis) and then performs an ordinary (un-regularized) least squares regression in this subspace.

regression
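
A toy comparison of the two estimators (the data, the number of retained components k, and lambda are invented for illustration):

```python
import numpy as np

# Ridge versus "PCA-OLS": project onto the top-k principal components and run
# unregularized least squares in that subspace.
rng = np.random.default_rng(11)
n, p, k, lam = 300, 40, 10, 5.0
X = rng.normal(size=(n, p)) @ np.diag(np.linspace(2.0, 0.1, p))  # decaying spectrum
y = X @ rng.normal(size=p) + rng.normal(size=n)

w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

_, _, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[:k].T                                   # n x k projected design
gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
w_pca = Vt[:k].T @ gamma                           # back to p-dimensional coefficients

for w in (w_ridge, w_pca):
    print(np.mean((X @ w - y) ** 2))               # in-sample risk of each fit
```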
