no code implementations • 3 Dec 2024 • Yuda Song, Hanlin Zhang, Carson Eisenach, Sham Kakade, Dean Foster, Udaya Ghai
Self-improvement is a mechanism used in Large Language Model (LLM) pre-training, post-training, and test-time inference.
1 code implementation • 29 Oct 2024 • Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham Kakade
Training large-scale models under given resources requires careful design of parallelism strategies.
no code implementations • 24 Sep 2024 • Carson Eisenach, Udaya Ghai, Dhruv Madeka, Kari Torkkola, Dean Foster, Sham Kakade
This paper addresses the capacitated periodic review inventory control problem, focusing on a retailer managing multiple products with limited shared resources, such as storage or inbound labor at a facility.
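As a toy illustration of the capacity coupling the abstract describes, the sketch below simulates one review period in which several products share a single inbound capacity; the proportional rationing rule and all names are illustrative assumptions, not the paper's model.

```python
import numpy as np

def review_period(inventory, orders, demand, capacity):
    """One period of a toy capacitated inventory system.

    Products share a single inbound capacity; if total orders exceed
    it, orders are scaled down proportionally (an assumed rationing
    rule, not necessarily the paper's). Unmet demand is lost.
    """
    total = orders.sum()
    if total > capacity:                      # shared resource binds
        orders = orders * (capacity / total)  # proportional rationing
    inventory = inventory + orders            # receive (rationed) orders
    sales = np.minimum(inventory, demand)     # lost-sales dynamics
    inventory = inventory - sales
    return inventory, sales

rng = np.random.default_rng(0)
inv = np.array([5.0, 3.0, 8.0])
inv, sales = review_period(inv, orders=np.array([4.0, 6.0, 2.0]),
                           demand=rng.poisson(4, size=3).astype(float),
                           capacity=8.0)
print(inv, sales)
```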
1 code implementation • 8 Dec 2023 • Xinyi Chen, Angelica Chen, Dean Foster, Elad Hazan
We give a novel efficient algorithm for simultaneous external and internal regret minimization whose regret depends logarithmically on the number of actions.
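For background on the external half of that objective, a minimal multiplicative-weights (Hedge) sketch is below; the internal-regret machinery and the paper's actual algorithm go beyond this classical building block.

```python
import numpy as np

def multiplicative_weights(loss_rounds, eta=0.1):
    """Classical Hedge/multiplicative-weights for external regret.

    loss_rounds: iterable of per-action loss vectors in [0, 1].
    Returns the sequence of probability distributions played.
    """
    plays = []
    weights = None
    for losses in loss_rounds:
        losses = np.asarray(losses, dtype=float)
        if weights is None:
            weights = np.ones_like(losses)
        p = weights / weights.sum()
        plays.append(p)
        weights = weights * np.exp(-eta * losses)  # exponential update
    return plays

# toy run: 3 actions, action 0 has the smallest loss
losses = [np.array([0.1, 0.9, 0.5]) for _ in range(100)]
dists = multiplicative_weights(losses)
print(dists[-1])  # mass concentrates on action 0
```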
1 code implementation • 7 Dec 2023 • Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Himabindu Lakkaraju, Sham Kakade
Accurate uncertainty quantification is crucial for the safe deployment of machine learning models, and prior research has demonstrated improvements in the calibration of modern language models (LMs).
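A standard way to quantify calibration in this line of work is expected calibration error; below is a minimal binned-ECE sketch, included for illustration and not necessarily the exact estimator used in the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: average |accuracy - confidence| over confidence
    bins, weighted by the fraction of samples in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

print(expected_calibration_error([0.9, 0.8, 0.6, 0.95], [1, 1, 0, 1]))
```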
no code implementations • 26 Oct 2023 • Sohrab Andaz, Carson Eisenach, Dhruv Madeka, Kari Torkkola, Randy Jia, Dean Foster, Sham Kakade
In this paper we address the problem of learning and backtesting inventory control policies in the presence of general arrival dynamics -- which we term a quantity-over-time (QOT) arrivals model.
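The paper's QOT model is more general; as a minimal illustration of the core idea that an order arrives as a distribution of quantities over time rather than all at once, consider the following sketch (all names assumed):

```python
import numpy as np

def simulate_arrivals(order_qty, arrival_shares, horizon):
    """Spread a single order's quantity over future periods.

    arrival_shares[k] is the fraction of the order arriving k periods
    after it is placed (fixed here; random in general) -- a toy
    stand-in for a quantity-over-time arrivals process.
    """
    arrivals = np.zeros(horizon)
    for k, share in enumerate(arrival_shares):
        if k < horizon:
            arrivals[k] = order_qty * share
    return arrivals

# e.g. 20% arrives after 1 period, 50% after 2, 30% after 3
print(simulate_arrivals(100.0, [0.0, 0.2, 0.5, 0.3], horizon=6))
```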
no code implementations • 24 Oct 2023 • Dean Foster, Randy Jia, Dhruv Madeka
Approaches to the periodic review inventory control problem with nonstationary random demand, lost sales, and stochastic vendor lead times typically make strong assumptions on the dynamics, for either approximation or simulation, and then apply methods such as optimization, dynamic programming, or reinforcement learning.
1 code implementation • 18 Jul 2023 • Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade
Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.
no code implementations • 14 Dec 2021 • Nilesh Tripuraneni, Dhruv Madeka, Dean Foster, Dominique Perrault-Joncas, Michael I. Jordan
The key insight of our procedure is that the noisy (but unbiased) difference-of-means estimate can be used as a ground-truth "label" on a portion of the RCT, to test the performance of an estimator trained on the other portion.
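A minimal sketch of that recipe, with all names assumed: split the RCT, train a treatment-effect estimator on one fold, and score its ATE prediction against the unbiased difference-of-means on the held-out fold.

```python
import numpy as np

def difference_of_means(y, t):
    """Unbiased ATE estimate from randomized data."""
    return y[t == 1].mean() - y[t == 0].mean()

def evaluate_on_rct(X, t, y, fit_estimator, rng):
    """Train on one half of the RCT, then compare the estimator's ATE
    prediction to the noisy-but-unbiased difference-of-means computed
    on the other half (the ground-truth "label")."""
    n = len(y)
    idx = rng.permutation(n)
    train, test = idx[: n // 2], idx[n // 2 :]
    predicted_ate = fit_estimator(X[train], t[train], y[train])
    benchmark = difference_of_means(y[test], t[test])
    return predicted_ate, benchmark

# toy data: true ATE = 2
rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 3))
t = rng.integers(0, 2, size=n)
y = X[:, 0] + 2.0 * t + rng.normal(size=n)
naive = lambda X, t, y: difference_of_means(y, t)  # trivial "estimator"
print(evaluate_on_rct(X, t, y, naive, rng))
```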
no code implementations • 2 Mar 2021 • Yucheng Lu, Youngsuk Park, Lifan Chen, Yuyang Wang, Christopher De Sa, Dean Foster
In large-scale time series forecasting, one often encounters time series whose temporal patterns drift over time yet differ from one another within the same dataset.
1 code implementation • 15 Feb 2021 • Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon
We show that our algorithm has a regret guarantee of $O(k\sqrt{(A-k+1)T \log (|\mathcal{F}|T)})$, where $A$ is the total number of arms and $\mathcal{F}$ is the class containing the regression function, while only requiring $\tilde{O}(A)$ computation per time step.
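To make the top-$k$ setting concrete, here is an epsilon-greedy stand-in that ignores the context for simplicity; it is not the paper's algorithm, which works with a regression oracle over the class $\mathcal{F}$.

```python
import numpy as np

def topk_epsilon_greedy(T, A, k, reward_fn, eps=0.1, seed=0):
    """Toy top-k bandit loop: keep a running mean reward per arm, pull
    the k arms with the highest estimates, and explore a uniformly
    random k-subset with probability eps."""
    rng = np.random.default_rng(seed)
    means, counts = np.zeros(A), np.zeros(A)
    total = 0.0
    for _ in range(T):
        if rng.random() < eps:
            chosen = rng.choice(A, size=k, replace=False)   # explore
        else:
            chosen = np.argsort(means)[-k:]                 # exploit
        rewards = np.array([reward_fn(a) for a in chosen])
        total += rewards.sum()
        counts[chosen] += 1
        means[chosen] += (rewards - means[chosen]) / counts[chosen]
    return total

print(topk_epsilon_greedy(T=500, A=20, k=3,
                          reward_fn=lambda a: 1.0 / (1.0 + a)))
```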
no code implementations • ICLR 2021 • Ruosong Wang, Dean Foster, Sham M. Kakade
Function approximation methods coupled with batch reinforcement learning (or off-policy reinforcement learning) are providing an increasingly important framework to help alleviate the excessive sample complexity burden in modern reinforcement learning problems.
1 code implementation • NeurIPS 2019 • Sergul Aydore, Tianhao Zhu, Dean Foster
We introduce a local regret for non-convex models in a dynamic environment.
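Local regret replaces comparator-based regret with a measure of how stationary the iterates are. For intuition, the sketch below computes the simpler sliding-window local regret of Hazan et al. (2017); the paper's dynamic variant weights past gradients differently, and all names here are assumptions.

```python
import numpy as np

def local_regret(grad_fns, xs, w):
    """Sliding-window local regret: at each step t, average the
    gradients of the last w losses at the current iterate x_t and
    accumulate the squared norm of that average."""
    total = 0.0
    for t in range(len(xs)):
        window = grad_fns[max(0, t - w + 1): t + 1]
        g = np.mean([grad(xs[t]) for grad in window], axis=0)
        total += float(np.dot(g, g))
    return total

# toy quadratic losses f_t(x) = ||x - c_t||^2 with drifting minima c_t
cs = [np.array([0.1 * t, 0.0]) for t in range(50)]
grads = [lambda x, c=c: 2.0 * (x - c) for c in cs]
xs = [c + 0.05 for c in cs]  # iterates tracking the drifting minima
print(local_regret(grads, xs, w=5))
```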
no code implementations • 28 May 2019 • Yuyang Wang, Alex Smola, Danielle C. Maddix, Jan Gasthaus, Dean Foster, Tim Januschowski
We provide both theoretical and empirical evidence for the soundness of our approach through a necessary and sufficient decomposition of exchangeable time series into a global and a local part.
no code implementations • 13 Nov 2018 • Sergul Aydore, Lee Dicker, Dean Foster
We consider an online learning process for forecasting a sequence of outcomes with nonconvex models.
no code implementations • ICLR 2018 • João Sedoc, Jordan Rodu, Dean Foster, Lyle Ungar
This paper presents a novel variant of hierarchical hidden Markov models (HMMs), the multiscale hidden Markov model (MSHMM), and an associated spectral estimation and prediction scheme that is consistent, finds global optima, and is computationally efficient.
no code implementations • ICLR 2018 • João Sedoc, Dean Foster, Lyle Ungar
We introduce a novel approach to tree-to-tree learning, the neural tree transducer (NTT), a top-down, depth-first, context-sensitive tree decoder, which is paired with recursive neural encoders.
1 code implementation • 13 Nov 2017 • John Thickstun, Zaid Harchaoui, Dean Foster, Sham M. Kakade
This paper explores a variety of models for frame-based music transcription, with an emphasis on the methods needed to reach state-of-the-art on human recordings.
no code implementations • ACL 2017 • João Sedoc, Jean Gallier, Dean Foster, Lyle Ungar
For spectral clustering using such word embeddings, words are points in a vector space where synonyms are linked with positive weights, while antonyms are linked with negative weights.
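One common way to act on such signed weights is a signed graph Laplacian; the sketch below embeds a tiny signed graph this way. The construction and names are illustrative assumptions; the paper specifies its precise variant.

```python
import numpy as np

def signed_spectral_embed(W, dim=2):
    """Embed nodes of a signed graph using the signed Laplacian
    L = D_bar - W, where D_bar sums absolute edge weights (one common
    construction for graphs with negative weights)."""
    D_bar = np.diag(np.abs(W).sum(axis=1))
    L = D_bar - W
    vals, vecs = np.linalg.eigh(L)        # ascending eigenvalues
    return vecs[:, :dim]                  # smoothest signed directions

# toy signed graph: {0,1} synonyms (+), {2,3} synonyms (+),
# antonym (-) edges across the two groups
W = np.array([[ 0.,  1., -1., -1.],
              [ 1.,  0., -1., -1.],
              [-1., -1.,  0.,  1.],
              [-1., -1.,  1.,  0.]])
emb = signed_spectral_embed(W)
labels = (emb[:, 0] > np.median(emb[:, 0])).astype(int)  # crude 2-way split
print(labels)  # separates {0,1} from {2,3}
```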
no code implementations • 7 Mar 2016 • Dean Foster, Satyen Kale, Howard Karloff
We consider the online sparse linear regression problem: sequentially making predictions while observing only a limited number of features in each round, so as to minimize regret with respect to the best sparse linear regressor, with prediction accuracy measured by square loss.
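The interaction protocol can be made concrete with a small sketch (all names are assumptions); the paper's results concern what is achievable within this protocol, not this particular learner.

```python
import numpy as np

def play_online_sparse(xs, ys, k, choose_features, predict):
    """Protocol for online sparse linear regression: each round the
    learner picks at most k feature indices, sees only those entries
    of x_t, predicts, then suffers square loss against y_t."""
    losses = []
    for x, y in zip(xs, ys):
        S = choose_features(k)                 # indices to observe, |S| <= k
        y_hat = predict(S, x[S])               # prediction from partial view
        losses.append((y_hat - y) ** 2)
    return float(np.sum(losses))

# toy instance: y depends on feature 0 only; a learner that always
# observes feature 0 and applies a fixed coefficient
rng = np.random.default_rng(2)
xs = rng.normal(size=(100, 10))
ys = 1.5 * xs[:, 0] + 0.1 * rng.normal(size=100)
loss = play_online_sparse(xs, ys, k=2,
                          choose_features=lambda k: np.array([0, 1]),
                          predict=lambda S, vals: 1.5 * vals[0])
print(loss)
```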
1 code implementation • 20 Jan 2016 • João Sedoc, Jean Gallier, Lyle Ungar, Dean Foster
Vector space representations of words capture many aspects of word similarity, but such methods tend to produce vector spaces in which antonyms (as well as synonyms) are close to each other.
no code implementations • 26 Jun 2015 • Zhuang Ma, Yichao Lu, Dean Foster
In this paper, we tackle the problem of large-scale CCA, where classical algorithms, which usually require computing the product of two huge matrices and a huge matrix decomposition, are expensive in both computation and storage.
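For reference, the classical computation that becomes prohibitive at scale: whiten each view, then take an SVD of the whitened cross-covariance. The sketch below is a textbook construction, not the paper's scalable algorithm, and makes the expensive product and decomposition explicit.

```python
import numpy as np

def classical_cca(X, Y, k, reg=1e-6):
    """Textbook CCA: whiten each view via Cholesky factors, then SVD
    the whitened cross-covariance. The d_x-by-d_y product and the
    decompositions are exactly what gets expensive at scale."""
    n = X.shape[0]
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n                              # huge matrix product
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))    # whitening transforms
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy))
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy.T)      # huge decomposition
    return Wx.T @ U[:, :k], Wy.T @ Vt[:k].T, s[:k]

rng = np.random.default_rng(3)
Z = rng.normal(size=(500, 2))                      # shared latent signal
X = Z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
Y = Z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
A, B, corrs = classical_cca(X, Y, k=2)
print(np.round(corrs, 3))                          # top canonical correlations
```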