no code implementations • 1 May 2023 • Dylan J. Foster, Dean P. Foster, Noah Golowich, Alexander Rakhlin
Compared to the best results for the single-agent setting, our bounds have additional gaps.
no code implementations • 14 Nov 2022 • Zeyu Jia, Randy Jia, Dhruv Madeka, Dean P. Foster
We study the problem of Reinforcement Learning (RL) with linear function approximation, i.e., assuming the optimal action-value function is linear in a known $d$-dimensional feature mapping.
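The linearity assumption is easy to state concretely: there is an unknown $\theta^* \in \mathbb{R}^d$ with $Q^*(s,a) = \phi(s,a)^\top \theta^*$. A minimal sketch, with a hypothetical feature map `phi` of my own choosing, not one from the paper:

```python
import numpy as np

# Sketch of the linear function approximation assumption: the optimal
# action-value function is Q*(s, a) = phi(s, a)^T theta* for an unknown
# parameter theta* in R^d. `phi` is a hypothetical feature map.

d = 4
rng = np.random.default_rng(0)
theta_star = rng.normal(size=d)              # unknown in practice

def phi(state, action):
    """Hypothetical d-dimensional feature map."""
    return np.tanh(state * (action + 1.0))   # any fixed featurization

def q_star(state, action):
    return phi(state, action) @ theta_star

# Greedy action under the linear model:
state = rng.normal(size=d)
actions = [0.0, 1.0, 2.0]
best = max(actions, key=lambda a: q_star(state, a))
print(best)
```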
no code implementations • 13 Oct 2022 • Dean P. Foster, Sergiu Hart
Calibration means that forecasts and average realized frequencies are close.
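As a toy illustration of the definition (the data here are invented): group the days by the forecast issued, and compare each forecast value with the frequency actually realized on those days.

```python
import numpy as np

# Toy calibration check: group binary outcomes by the forecast issued,
# and compare each forecast value to the empirical frequency it received.
forecasts = np.array([0.3, 0.3, 0.7, 0.7, 0.7, 0.3])
outcomes  = np.array([0,   1,   1,   1,   0,   0  ])

for p in np.unique(forecasts):
    mask = forecasts == p
    freq = outcomes[mask].mean()
    print(f"forecast {p:.1f}: realized frequency {freq:.2f}, gap {abs(freq - p):.2f}")
```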
no code implementations • 13 Oct 2022 • Dean P. Foster, Sergiu Hart
We propose to smooth out the calibration score, which measures how good a forecaster is, by combining nearby forecasts.
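A hedged sketch of the combining-nearby-forecasts idea: replace exact-match bins with kernel weights, so similar forecasts are pooled. The triangular kernel and bandwidth below are illustrative choices of mine, not taken from the paper.

```python
import numpy as np

# Sketch of a smoothed calibration score: instead of exact-match bins,
# weight day t by a kernel of the distance between its forecast and p,
# so nearby forecasts are pooled together.
def smooth_calibration_score(forecasts, outcomes, grid, bandwidth=0.1):
    score = 0.0
    for p in grid:
        w = np.clip(1.0 - np.abs(forecasts - p) / bandwidth, 0.0, None)
        if w.sum() == 0:
            continue
        freq = (w * outcomes).sum() / w.sum()   # smoothed realized frequency near p
        score += (w.sum() / len(forecasts)) * abs(freq - p)
    return score

forecasts = np.array([0.32, 0.30, 0.71, 0.69])
outcomes  = np.array([0, 1, 1, 1])
print(smooth_calibration_score(forecasts, outcomes, grid=np.linspace(0, 1, 11)))
```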
no code implementations • 6 Oct 2022 • Dhruv Madeka, Kari Torkkola, Carson Eisenach, Anna Luo, Dean P. Foster, Sham M. Kakade
This work provides a Deep Reinforcement Learning approach to solving a periodic review inventory control system with stochastic vendor lead times, lost sales, correlated demand, and price matching.
no code implementations • 11 Sep 2022 • Dean P. Foster, Sergiu Hart
In order to identify expertise, forecasters should not be tested by their calibration score, which can always be made arbitrarily small, but rather by their Brier score.
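For reference, the Brier score of probability forecasts for binary outcomes is the average squared gap between forecast and realization:

```python
import numpy as np

# Brier score for probabilistic forecasts of binary outcomes:
# the average squared gap between forecast and realization.
def brier_score(forecasts, outcomes):
    forecasts = np.asarray(forecasts, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((forecasts - outcomes) ** 2))

print(brier_score([0.9, 0.8, 0.1], [1, 1, 0]))  # 0.02
```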
no code implementations • 18 Jul 2022 • Philip Amortila, Nan Jiang, Dhruv Madeka, Dean P. Foster
Towards establishing the minimal number of expert queries needed, we show that, in the same setting, any learner whose exploration budget is polynomially bounded (in terms of $d, H,$ and $|\mathcal{A}|$) will require at least $\tilde\Omega(\sqrt{d})$ oracle calls to recover a policy competing with the expert's value function.
no code implementations • 3 Dec 2021 • Dean P. Foster, Alexander Rakhlin
We consider the problem of contextual bandits where actions are subsets of a ground set and mean rewards are modeled by an unknown monotone submodular function that belongs to a class $\mathcal{F}$.
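For context, the classical greedy heuristic for maximizing a monotone submodular function under a cardinality constraint is the natural offline benchmark in this setting; the sketch below (with a hypothetical coverage-function oracle `f`) is that baseline, not the paper's bandit algorithm.

```python
# Classical greedy for monotone submodular maximization under a
# cardinality constraint: repeatedly add the element with the largest
# marginal gain. `f` is a hypothetical set-function value oracle.
def greedy_subset(f, ground_set, k):
    chosen = set()
    for _ in range(k):
        best = max((x for x in ground_set if x not in chosen),
                   key=lambda x: f(chosen | {x}) - f(chosen))
        chosen.add(best)
    return chosen

# Example with a coverage function (monotone submodular):
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(greedy_subset(f, sets.keys(), 2))  # {1, 2}
```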
no code implementations • NeurIPS 2021 • Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean P. Foster, Sham M. Kakade
Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice, which has been hypothesized to play an important role in the generalization of modern machine learning approaches.
no code implementations • 14 May 2021 • Dean P. Foster, Robert A. Stine
In addition to being calibrated, a threshold martingale has quadratic variation that accumulates to a total determined by a quantile of the initial forecast distribution.
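Quadratic variation here is the accumulated sum of squared increments of the forecast path; a toy computation (my own illustration, not the paper's construction):

```python
import numpy as np

# Quadratic variation of a forecast path p_0, p_1, ..., p_T:
# the running sum of squared increments sum_t (p_t - p_{t-1})^2.
def quadratic_variation(path):
    increments = np.diff(np.asarray(path, dtype=float))
    return float(np.sum(increments ** 2))

print(quadratic_variation([0.5, 0.6, 0.4, 1.0]))  # 0.01 + 0.04 + 0.36 = 0.41
```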
no code implementations • 22 Oct 2020 • Ruosong Wang, Dean P. Foster, Sham M. Kakade
Offline reinforcement learning seeks to utilize offline (observational) data to guide the learning of (causal) sequential decision-making strategies.
no code implementations • 20 Nov 2018 • John Thickstun, Zaid Harchaoui, Dean P. Foster, Sham M. Kakade
This paper introduces a novel recurrent model for music composition that is tailored to the structure of polyphonic music.
no code implementations • NeurIPS 2014 • Yichao Lu, Dean P. Foster
Canonical Correlation Analysis (CCA) is a widely used statistical tool with both well-established theory and favorable performance for a wide range of machine learning problems.
no code implementations • 15 May 2014 • Yichao Lu, Dean P. Foster
We propose LING, a new two-stage algorithm for large-scale regression problems.
no code implementations • NeurIPS 2013 • Lee H. Dicker, Dean P. Foster
One of the salient features of our analysis is that the problems studied here are easier when the dimension of $x_i$ is large; in other words, prediction becomes easier when more context is provided.
no code implementations • NeurIPS 2013 • Paramveer Dhillon, Yichao Lu, Dean P. Foster, Lyle Ungar
We address the problem of fast estimation of ordinary least squares (OLS) from large amounts of data ($n \gg p$).
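One generic way to see why the $n \gg p$ regime is tractable: OLS depends on the data only through the $p \times p$ Gram matrix $X^\top X$ and the $p$-vector $X^\top y$, both computable in a single streaming pass over row chunks. The sketch below is that baseline, not necessarily the paper's estimator:

```python
import numpy as np

# Streaming OLS for n >> p: accumulate X^T X (p x p) and X^T y (p,)
# chunk by chunk, then solve one small linear system at the end.
def streaming_ols(chunks):
    XtX, Xty = None, None
    for X, y in chunks:                       # X: (m, p) chunk, y: (m,)
        if XtX is None:
            p = X.shape[1]
            XtX, Xty = np.zeros((p, p)), np.zeros(p)
        XtX += X.T @ X
        Xty += X.T @ y
    return np.linalg.solve(XtX, Xty)

rng = np.random.default_rng(1)
beta = np.array([2.0, -1.0, 0.5])
chunks = []
for _ in range(10):
    X = rng.normal(size=(1000, 3))
    chunks.append((X, X @ beta + 0.1 * rng.normal(size=1000)))
print(streaming_ols(chunks))  # close to [2.0, -1.0, 0.5]
```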
no code implementations • NeurIPS 2013 • Yichao Lu, Paramveer Dhillon, Dean P. Foster, Lyle Ungar
We propose a fast algorithm for ridge regression when the number of features is much larger than the number of observations ($p \gg n$).
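A standard trick in the $p \gg n$ regime (shown as background, not necessarily the paper's algorithm) is the dual form of ridge regression, which solves an $n \times n$ system instead of a $p \times p$ one:

```python
import numpy as np

# Dual form of ridge regression, useful when p >> n:
# beta = X^T (X X^T + lam * I_n)^{-1} y solves an n-by-n system
# instead of a p-by-p one, with the same solution.
def ridge_dual(X, y, lam):
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 5000))   # n=50 observations, p=5000 features
y = rng.normal(size=50)
beta = ridge_dual(X, y, lam=1.0)
print(beta.shape)                 # (5000,)
```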
no code implementations • NeurIPS 2012 • Anima Anandkumar, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Yi-Kai Liu
This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA).
no code implementations • NeurIPS 2011 • Paramveer Dhillon, Dean P. Foster, Lyle H. Ungar
Recently, there has been substantial interest in using large amounts of unlabeled data to learn word representations which can then be used as features in supervised classifiers for NLP tasks.
no code implementations • NeurIPS 2011 • Alekh Agarwal, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Alexander Rakhlin
This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit feedback model.
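To make the feedback model concrete: the learner queries one point per round and observes only the function value there, never a gradient. The classical one-point gradient estimator of Flaxman, Kalai, and McMahan is sketched below as a baseline; it is not the algorithm analyzed in this paper, and the step sizes are illustrative.

```python
import numpy as np

# Bandit feedback: each round we query a single point x_t and see only
# f(x_t). The one-point estimator (d/delta) * f(x + delta*u) * u is an
# unbiased gradient estimate of a smoothed version of f.
def one_point_sgd(f, x0, rounds=2000, delta=0.1, eta=0.01):
    x = np.asarray(x0, dtype=float)
    d = x.size
    rng = np.random.default_rng(0)
    for _ in range(rounds):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                  # uniform direction on the sphere
        g = (d / delta) * f(x + delta * u) * u  # one-point gradient estimate
        x -= eta * g
        x = np.clip(x, -1.0, 1.0)               # projection onto the feasible box
    return x

# Noisy iterate near the minimizer (0.3, 0.3, 0.3):
print(one_point_sgd(lambda z: np.sum((z - 0.3) ** 2), np.zeros(3)))
```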
no code implementations • 4 May 2011 • Paramveer S. Dhillon, Dean P. Foster, Sham M. Kakade, Lyle H. Ungar
We compare the risk of ridge regression to a simple variant of ordinary least squares, in which one simply projects the data onto a finite-dimensional subspace (as specified by a Principal Component Analysis) and then performs an ordinary (unregularized) least squares regression in this subspace.
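A minimal sketch of that comparator, principal component regression: project the design matrix onto its top-$k$ principal components, then run plain least squares on the scores (the data and the choice of $k$ below are illustrative).

```python
import numpy as np

# Principal component regression, the OLS variant described above:
# project the centered design matrix onto its top-k principal
# components, then run un-regularized least squares in that subspace.
def pcr(X, y, k):
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T                       # scores in the k-dim PCA subspace
    gamma, *_ = np.linalg.lstsq(Z, y - y.mean(), rcond=None)
    return Vt[:k].T @ gamma                 # coefficients mapped back to R^p

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 20))
y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=200)
beta_k = pcr(X, y, k=5)
print(beta_k.shape)                         # (20,)
```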