no code implementations • 7 Mar 2025 • Matthew Faw, Constantine Caramanis, Jessica Hoffmann
We prove a new instance-dependent regret lower bound, which is larger than that in the standard bandit setting by a multiplicative function of $K$.
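For context, a hedged sketch of the comparison (standard bandit background, not a statement from the paper): the classical instance-dependent lower bound for $K$-armed stochastic bandits reads

$$\liminf_{T \to \infty} \frac{\mathbb{E}[R_T]}{\log T} \;\ge\; \sum_{i \,:\, \Delta_i > 0} \frac{\Delta_i}{\mathrm{KL}(\nu_i, \nu^*)},$$

where $\Delta_i$ is the suboptimality gap of arm $i$ and $\mathrm{KL}(\nu_i, \nu^*)$ is the divergence between arm $i$'s reward distribution and that of the optimal arm. The result above says that the corresponding bound in this setting is inflated by a multiplicative factor $f(K)$, whose exact form is given in the paper.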
no code implementations • 31 Oct 2024 • Abhimanyu Das, Matthew Faw, Rajat Sen, Yichen Zhou
Our foundation model is specifically trained to use examples from multiple related time series in its context window, in addition to the history of the target time series, to adapt to the distribution of the target domain at inference time.
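A minimal sketch of how such a context window might be assembled, assuming a hypothetical `build_context` helper and a model exposing a `predict(context, horizon)` interface (the actual tokenization and patching are not specified in the snippet above):

```python
import numpy as np

def build_context(target_history, related_series, sep_value=np.nan, max_len=512):
    """Concatenate in-context examples from related time series with the
    target's own history, separated by a sentinel value, and keep only the
    most recent max_len points.  Illustrative only."""
    pieces = []
    for series in related_series:
        pieces.append(np.asarray(series, dtype=float))
        pieces.append(np.array([sep_value]))  # separator between examples
    pieces.append(np.asarray(target_history, dtype=float))
    return np.concatenate(pieces)[-max_len:]

# Hypothetical usage:
# context = build_context(target, [related_a, related_b])
# forecast = model.predict(context, horizon=24)
```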
no code implementations • 13 Feb 2023 • Matthew Faw, Litu Rout, Constantine Caramanis, Sanjay Shakkottai
Despite this richness, an emerging line of work achieves the $\widetilde{\mathcal{O}}(\frac{1}{\sqrt{T}})$ rate of convergence when the noise of the stochastic gradients is deterministically and uniformly bounded.
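For concreteness, the bounded-noise condition referenced here is usually formalized as follows (standard background, not the paper's exact assumption):

$$\big\| \nabla f(x_t; \xi_t) - \nabla F(x_t) \big\| \le \sigma \quad \text{almost surely for all } t,$$

for some fixed constant $\sigma \ge 0$, where $\nabla f(x_t; \xi_t)$ is the stochastic gradient returned by the oracle and $\nabla F(x_t)$ is the true gradient.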
no code implementations • 11 Feb 2022 • Matthew Faw, Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari, Sanjay Shakkottai, Rachel Ward
We study convergence rates of AdaGrad-Norm, an exemplar of adaptive stochastic gradient descent (SGD) methods in which the step sizes change based on the observed stochastic gradients, for minimizing non-convex, smooth objectives.
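A minimal AdaGrad-Norm sketch, assuming only that `grad(x)` returns a (possibly noisy) gradient; this illustrates the scalar step-size rule, not the paper's analysis:

```python
import numpy as np

def adagrad_norm(grad, x0, eta=1.0, b0=1e-2, T=1000):
    """AdaGrad-Norm: a single scalar step size eta / b_t, where b_t^2
    accumulates the squared norms of all observed stochastic gradients."""
    x, b_sq = np.asarray(x0, dtype=float), b0 ** 2
    for _ in range(T):
        g = grad(x)
        b_sq += float(np.dot(g, g))            # b_t^2 = b_{t-1}^2 + ||g_t||^2
        x = x - (eta / np.sqrt(b_sq)) * g      # x_{t+1} = x_t - (eta / b_t) g_t
    return x

# Example on a noisy quadratic:
# rng = np.random.default_rng(0)
# x_min = adagrad_norm(lambda x: 2 * x + 0.1 * rng.standard_normal(x.shape),
#                      x0=np.ones(5))
```

The defining feature is that the step size adapts to the observed gradients without requiring knowledge of the smoothness or noise parameters.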
1 code implementation • NeurIPS 2020 • Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai
We consider a covariate shift problem where one has access to several different training datasets for the same learning problem and a small validation set whose distribution possibly differs from all of the individual training distributions.
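A hedged illustration of the setup (not the paper's algorithm): given several training datasets and a small validation set, one simple baseline searches a coarse grid of mixture weights over the datasets, fits a model on each weighted mixture via user-supplied `fit` and `loss` callables, and keeps the weights that minimize validation loss.

```python
import numpy as np
from itertools import product

def best_mixture(train_sets, val_set, fit, loss, grid=5):
    """Coarse grid search over mixture weights on the simplex.
    `fit(train_sets, weights)` trains a model on the weighted mixture;
    `loss(model, val_set)` evaluates it on the small validation set."""
    K = len(train_sets)
    best_loss, best_weights, best_model = np.inf, None, None
    for w in product(range(grid + 1), repeat=K):
        if sum(w) != grid:
            continue                      # keep only points on the simplex
        weights = np.array(w) / grid
        model = fit(train_sets, weights)  # e.g. weighted empirical risk minimization
        val_loss = loss(model, val_set)
        if val_loss < best_loss:
            best_loss, best_weights, best_model = val_loss, weights, model
    return best_weights, best_model
```

Note that this brute-force search scales exponentially in the number of datasets; it is only meant to make the "select a training mixture using a small validation set" setup concrete.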