Search Results for author: Dean Foster

Found 17 papers, 5 papers with code

A Framework for the Meta-Analysis of Randomized Experiments with Applications to Heavy-Tailed Response Data

no code implementations · 14 Dec 2021 · Nilesh Tripuraneni, Dhruv Madeka, Dean Foster, Dominique Perrault-Joncas, Michael I. Jordan

The key insight of our procedure is that the noisy (but unbiased) difference-of-means estimate can be used as a ground truth "label" on a portion of the RCT, to test the performance of an estimator trained on the other portion.

Variance Reduced Training with Stratified Sampling for Forecasting Models

no code implementations · 2 Mar 2021 · Yucheng Lu, Youngsuk Park, Lifan Chen, Yuyang Wang, Christopher De Sa, Dean Foster

In large-scale time series forecasting, one often encounters the situation where the temporal patterns of time series, while drifting over time, differ from one another in the same dataset.

Time Series, Time Series Forecasting

Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy

1 code implementation · 15 Feb 2021 · Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon

We show that our algorithm has a regret guarantee of $O(k\sqrt{(A-k+1)T \log (|\mathcal{F}|T)})$, where $A$ is the total number of arms and $\mathcal{F}$ is the class containing the regression function, while only requiring $\tilde{O}(A)$ computation per time step.

Extreme Multi-Label Classification, Multi-Armed Bandits

What are the Statistical Limits of Batch RL with Linear Function Approximation?

no code implementations · ICLR 2021 · Ruosong Wang, Dean Foster, Sham M. Kakade

Function approximation methods coupled with batch reinforcement learning (or off-policy reinforcement learning) are providing an increasingly important framework to help alleviate the excessive sample complexity burden in modern reinforcement learning problems.


Deep Factors for Forecasting

no code implementations · 28 May 2019 · Yuyang Wang, Alex Smola, Danielle C. Maddix, Jan Gasthaus, Dean Foster, Tim Januschowski

We provide both theoretical and empirical evidence for the soundness of our approach through a necessary and sufficient decomposition of exchangeable time series into a global and a local part.

Time Series

A Local Regret in Nonconvex Online Learning

no code implementations · 13 Nov 2018 · Sergul Aydore, Lee Dicker, Dean Foster

We consider an online learning process to forecast a sequence of outcomes for nonconvex models.

Online Learning

Multiscale Hidden Markov Models For Covariance Prediction

no code implementations · ICLR 2018 · João Sedoc, Jordan Rodu, Dean Foster, Lyle Ungar

This paper presents a novel variant of hierarchical hidden Markov models (HMMs), the multiscale hidden Markov model (MSHMM), and an associated spectral estimation and prediction scheme that is consistent, finds global optima, and is computationally efficient.

Neural Tree Transducers for Tree to Tree Learning

no code implementations · ICLR 2018 · João Sedoc, Dean Foster, Lyle Ungar

We introduce a novel approach to tree-to-tree learning, the neural tree transducer (NTT), a top-down depth first context-sensitive tree decoder, which is paired with recursive neural encoders.

Invariances and Data Augmentation for Supervised Music Transcription

1 code implementation · 13 Nov 2017 · John Thickstun, Zaid Harchaoui, Dean Foster, Sham M. Kakade

This paper explores a variety of models for frame-based music transcription, with an emphasis on the methods needed to reach state-of-the-art on human recordings.

Data Augmentation, Frame

Semantic Word Clusters Using Signed Spectral Clustering

no code implementations · ACL 2017 · João Sedoc, Jean Gallier, Dean Foster, Lyle Ungar

For spectral clustering using such word embeddings, words are points in a vector space where synonyms are linked with positive weights, while antonyms are linked with negative weights.

Graph Clustering, Semantic Textual Similarity

Online Sparse Linear Regression

no code implementations · 7 Mar 2016 · Dean Foster, Satyen Kale, Howard Karloff

We consider the online sparse linear regression problem, which is the problem of sequentially making predictions observing only a limited number of features in each round, to minimize regret with respect to the best sparse linear regressor, where prediction accuracy is measured by square loss.

Semantic Word Clusters Using Signed Normalized Graph Cuts

1 code implementation · 20 Jan 2016 · João Sedoc, Jean Gallier, Lyle Ungar, Dean Foster

Vector space representations of words capture many aspects of word similarity, but such methods tend to make vector spaces in which antonyms (as well as synonyms) are close to each other.

Word Similarity

Finding Linear Structure in Large Datasets with Scalable Canonical Correlation Analysis

no code implementations · 26 Jun 2015 · Zhuang Ma, Yichao Lu, Dean Foster

In this paper, we tackle the problem of large scale CCA, where classical algorithms, usually requiring computing the product of two huge matrices and huge matrix decomposition, are computationally and storage expensive.

Stochastic Optimization
