no code implementations • NeurIPS 2017 • Raman Arora, Teodor V. Marinov, Poorya Mianjy, Nathan Srebro
We propose novel first-order stochastic approximation algorithms for canonical correlation analysis (CCA).
1 code implementation • 2 Aug 2018 • Enayat Ullah, Poorya Mianjy, Teodor V. Marinov, Raman Arora
We study the statistical and computational aspects of kernel principal component analysis using random Fourier features and show that, under mild assumptions, $O(\sqrt{n} \log n)$ features suffice to achieve $O(1/\epsilon^2)$ sample complexity.
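The random Fourier feature map underlying this result can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the RBF kernel, the bandwidth `gamma`, and all sizes are assumptions for the example.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, rng):
    """Map X (n x d) into a random feature space whose inner products
    approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are drawn from the Fourier transform of the RBF kernel.
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Z = random_fourier_features(X, n_features=100, gamma=0.5, rng=rng)

# Approximate kernel PCA: ordinary PCA in the random feature space.
Zc = Z - Z.mean(axis=0)
_, _, Vt = np.linalg.svd(Zc, full_matrices=False)
top_components = Zc @ Vt[:2].T  # projection onto the top-2 directions
```

Since `Z @ Z.T` approximates the kernel matrix, eigenvectors in the feature space approximate the kernel PCA directions at a fraction of the cost.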
no code implementations • NeurIPS 2018 • Raman Arora, Michael Dinitz, Teodor V. Marinov, Mehryar Mohri
We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other.
no code implementations • NeurIPS 2019 • Raman Arora, Teodor V. Marinov, Mehryar Mohri
We give a new algorithm whose regret guarantee depends only on the domination number of the graph.
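The domination number here is the size of a smallest dominating set of the feedback graph, i.e. the fewest arms whose observation sets cover all arms. A standard greedy approximation (a sketch for intuition, not the paper's algorithm; the adjacency encoding is an assumption) looks like:

```python
def greedy_dominating_set(adj):
    """Greedy approximation of a dominating set of a feedback graph.

    adj[v] is the set of vertices observed when playing v (including v).
    The greedy set is within a logarithmic factor of the domination number.
    """
    uncovered = set(adj)
    dominating = []
    while uncovered:
        # Pick the vertex whose observation set covers the most uncovered arms.
        v = max(adj, key=lambda u: len(adj[u] & uncovered))
        dominating.append(v)
        uncovered -= adj[v]
    return dominating

# A star graph: playing the center observes every arm.
adj = {0: {0, 1, 2, 3}, 1: {1}, 2: {2}, 3: {3}}
print(greedy_dominating_set(adj))  # the center alone dominates: [0]
```

A regret bound in terms of the domination number can be much stronger than one in terms of the independence number, e.g. the star graph above has domination number 1 regardless of the number of arms.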
no code implementations • 22 Feb 2020 • Raman Arora, Teodor V. Marinov, Enayat Ullah
In this paper, we revisit the problem of private stochastic convex optimization.
no code implementations • 16 Jun 2020 • Raman Arora, Teodor V. Marinov, Mehryar Mohri
We study the problem of corralling stochastic bandit algorithms, that is, combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best base algorithm.
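The corralling setup can be pictured with a meta-learner choosing which base algorithm to trust each round. The skeleton below is only an illustrative EXP3-style sketch under assumed names (`EpsGreedy`, `corral`, Bernoulli rewards); the paper's corralling algorithm uses a more careful mirror-descent update with per-base learning rates.

```python
import numpy as np

class EpsGreedy:
    """A simple epsilon-greedy base bandit (hypothetical base algorithm)."""
    def __init__(self, n_arms, eps, rng):
        self.eps, self.rng = eps, rng
        self.counts = np.zeros(n_arms)
        self.means = np.zeros(n_arms)

    def select(self):
        if self.rng.random() < self.eps:
            return int(self.rng.integers(len(self.means)))
        return int(np.argmax(self.means))

    def update(self, arm, r):
        self.counts[arm] += 1
        self.means[arm] += (r - self.means[arm]) / self.counts[arm]

def corral(base_algs, T, reward_fn, eta=0.1, explore=0.1, seed=0):
    """EXP3-style meta-learner over base bandit algorithms (sketch only)."""
    rng = np.random.default_rng(seed)
    M = len(base_algs)
    w = np.ones(M)
    total = 0.0
    for _ in range(T):
        # Mix in uniform exploration so importance weights stay bounded.
        p = (1 - explore) * w / w.sum() + explore / M
        i = rng.choice(M, p=p)
        arm = base_algs[i].select()      # play the chosen base's arm
        r = reward_fn(arm, rng)          # stochastic reward in [0, 1]
        base_algs[i].update(arm, r)
        total += r
        w[i] *= np.exp(eta * r / p[i])   # importance-weighted update
        w /= w.max()                     # keep weights numerically stable
    return total

# Two-armed Bernoulli environment; the meta-learner shifts weight toward
# whichever base algorithm earns more reward.
means = [0.2, 0.8]
def bernoulli(arm, rng):
    return float(rng.random() < means[arm])

bases = [EpsGreedy(2, eps, np.random.default_rng(s))
         for eps, s in [(0.1, 1), (0.5, 2)]]
total = corral(bases, 2000, bernoulli)
```

The difficulty the paper addresses is that naive corralling like this can starve a base algorithm of samples and break its own regret guarantee.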
no code implementations • NeurIPS 2021 • Christoph Dann, Teodor V. Marinov, Mehryar Mohri, Julian Zimmert
Our results show that optimistic algorithms cannot achieve the information-theoretic lower bounds, even in deterministic MDPs, unless there is a unique optimal policy.
no code implementations • NeurIPS 2021 • Teodor V. Marinov, Julian Zimmert
Recent progress in model selection raises the question of the fundamental limits of these techniques.
no code implementations • 20 Jun 2022 • Teodor V. Marinov, Mehryar Mohri, Julian Zimmert
We revisit the problem of stochastic online learning with feedback graphs, with the goal of devising algorithms that are optimal, up to constants, both asymptotically and in finite time.
no code implementations • 7 Feb 2023 • Alekh Agarwal, Claudio Gentile, Teodor V. Marinov
We study contextual bandit (CB) problems, where the user can sometimes respond with the best action in a given context.
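A toy version of such mixed feedback is sketched below: a linear contextual bandit where, with some probability, the environment also reveals the best action for the current context, which the learner can exploit as a supervised signal. Every name and the linear-reward setup are assumptions for illustration, not the paper's model or algorithm.

```python
import numpy as np

def cb_with_best_action_feedback(T, d, n_arms, reveal_prob, seed=0):
    """Epsilon-greedy linear contextual bandit with occasional
    best-action feedback (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_arms, d))    # true per-arm parameters
    est = np.zeros((n_arms, d))             # ridge-regression estimates
    A = np.stack([np.eye(d)] * n_arms)      # per-arm Gram matrices
    b = np.zeros((n_arms, d))
    correct = 0
    for _ in range(T):
        x = rng.normal(size=d)
        # epsilon-greedy arm choice on the current estimates
        if rng.random() < 0.1:
            a = int(rng.integers(n_arms))
        else:
            a = int(np.argmax(est @ x))
        best = int(np.argmax(theta @ x))
        correct += (a == best)
        r = theta[a] @ x + 0.1 * rng.normal()
        A[a] += np.outer(x, x); b[a] += r * x
        est[a] = np.linalg.solve(A[a], b[a])
        if rng.random() < reveal_prob:
            # The best action is revealed: update its estimate as well.
            rb = theta[best] @ x + 0.1 * rng.normal()
            A[best] += np.outer(x, x); b[best] += rb * x
            est[best] = np.linalg.solve(A[best], b[best])
    return correct / T

accuracy = cb_with_best_action_feedback(2000, 3, 3, reveal_prob=0.3)
```

The revealed-best-action rounds act like labeled classification examples, which is why this feedback can sharply reduce the exploration a bandit learner would otherwise need.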
no code implementations • 26 May 2023 • Jacob Abernethy, Alekh Agarwal, Teodor V. Marinov, Manfred K. Warmuth
We study the phenomenon of \textit{in-context learning} (ICL) exhibited by large language models, where they can adapt to a new learning task, given a handful of labeled examples, without any explicit parameter optimization.
no code implementations • 28 Mar 2024 • Teodor V. Marinov, Alekh Agarwal, Mircea Trofin
This work studies a Reinforcement Learning (RL) problem in which we are given a set of trajectories collected with $K$ baseline policies.