Search Results for author: Teodor V. Marinov

Found 9 papers, 1 papers with code

Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality

no code implementations20 Jun 2022 Teodor V. Marinov, Mehryar Mohri, Julian Zimmert

We revisit the problem of stochastic online learning with feedback graphs, with the goal of devising algorithms that are optimal, up to constants, both asymptotically and in finite time.

online learning

Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

no code implementations NeurIPS 2021 Christoph Dann, Teodor V. Marinov, Mehryar Mohri, Julian Zimmert

Our results show that optimistic algorithms can not achieve the information-theoretic lower bounds even in deterministic MDPs unless there is a unique optimal policy.

reinforcement-learning

Corralling Stochastic Bandit Algorithms

no code implementations16 Jun 2020 Raman Arora, Teodor V. Marinov, Mehryar Mohri

We study the problem of corralling stochastic bandit algorithms, that is combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best base algorithm.

Bandits with Feedback Graphs and Switching Costs

no code implementations NeurIPS 2019 Raman Arora, Teodor V. Marinov, Mehryar Mohri

We give a new algorithm whose regret guarantee depends only on the domination number of the graph.

Policy Regret in Repeated Games

no code implementations NeurIPS 2018 Raman Arora, Michael Dinitz, Teodor V. Marinov, Mehryar Mohri

We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other.

online learning

Streaming Kernel PCA with $\tilde{O}(\sqrt{n})$ Random Features

1 code implementation2 Aug 2018 Enayat Ullah, Poorya Mianjy, Teodor V. Marinov, Raman Arora

We study the statistical and computational aspects of kernel principal component analysis using random Fourier features and show that under mild assumptions, $O(\sqrt{n} \log n)$ features suffices to achieve $O(1/\epsilon^2)$ sample complexity.

Stochastic Approximation for Canonical Correlation Analysis

no code implementations NeurIPS 2017 Raman Arora, Teodor V. Marinov, Poorya Mianjy, Nathan Srebro

We propose novel first-order stochastic approximation algorithms for canonical correlation analysis (CCA).

Cannot find the paper you are looking for? You can Submit a new open access paper.