no code implementations • NeurIPS 2017 • Raman Arora, Teodor V. Marinov, Poorya Mianjy, Nathan Srebro
We propose novel first-order stochastic approximation algorithms for canonical correlation analysis (CCA).
1 code implementation • 2 Aug 2018 • Enayat Ullah, Poorya Mianjy, Teodor V. Marinov, Raman Arora
We study the statistical and computational aspects of kernel principal component analysis using random Fourier features and show that, under mild assumptions, $O(\sqrt{n} \log n)$ features suffice to achieve $O(1/\epsilon^2)$ sample complexity.
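The random Fourier feature map underlying this result can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the RBF kernel, the bandwidth `gamma`, and all sizes are assumptions for the example.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, rng):
    """Map X (n x d) into a random feature space whose inner products
    approximate the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies are drawn from the Fourier transform of the RBF kernel.
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Z = random_fourier_features(X, n_features=100, gamma=0.5, rng=rng)

# Approximate kernel PCA: ordinary PCA in the random feature space.
Zc = Z - Z.mean(axis=0)
_, _, Vt = np.linalg.svd(Zc, full_matrices=False)
top_components = Zc @ Vt[:2].T  # projection onto the top-2 directions
```

Since `Z @ Z.T` approximates the kernel matrix, eigenvectors in the feature space approximate the kernel PCA directions at a fraction of the cost.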
no code implementations • NeurIPS 2018 • Raman Arora, Michael Dinitz, Teodor V. Marinov, Mehryar Mohri
We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other.
no code implementations • NeurIPS 2019 • Raman Arora, Teodor V. Marinov, Mehryar Mohri
We give a new algorithm whose regret guarantee depends only on the domination number of the graph.
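The domination number here is the size of a smallest dominating set of the feedback graph, i.e. the fewest arms whose observation sets cover all arms. A standard greedy approximation (a sketch for intuition, not the paper's algorithm; the adjacency encoding is an assumption) looks like:

```python
def greedy_dominating_set(adj):
    """Greedy approximation of a dominating set of a feedback graph.

    adj[v] is the set of vertices observed when playing v (including v).
    The greedy set is within a logarithmic factor of the domination number.
    """
    uncovered = set(adj)
    dominating = []
    while uncovered:
        # Pick the vertex whose observation set covers the most uncovered arms.
        v = max(adj, key=lambda u: len(adj[u] & uncovered))
        dominating.append(v)
        uncovered -= adj[v]
    return dominating

# A star graph: playing the center observes every arm.
adj = {0: {0, 1, 2, 3}, 1: {1}, 2: {2}, 3: {3}}
print(greedy_dominating_set(adj))  # the center alone dominates: [0]
```

A regret bound in terms of the domination number can be much stronger than one in terms of the independence number, e.g. the star graph above has domination number 1 regardless of the number of arms.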
no code implementations • 22 Feb 2020 • Raman Arora, Teodor V. Marinov, Enayat Ullah
In this paper, we revisit the problem of private stochastic convex optimization.
no code implementations • 16 Jun 2020 • Raman Arora, Teodor V. Marinov, Mehryar Mohri
We study the problem of corralling stochastic bandit algorithms, that is, combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best base algorithm.
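The corralling setup can be pictured with a meta-learner choosing which base algorithm to trust each round. The skeleton below is only an illustrative EXP3-style sketch under assumed names (`EpsGreedy`, `corral`, Bernoulli rewards); the paper's corralling algorithm uses a more careful mirror-descent update with per-base learning rates.

```python
import numpy as np

class EpsGreedy:
    """A simple epsilon-greedy base bandit (hypothetical base algorithm)."""
    def __init__(self, n_arms, eps, rng):
        self.eps, self.rng = eps, rng
        self.counts = np.zeros(n_arms)
        self.means = np.zeros(n_arms)

    def select(self):
        if self.rng.random() < self.eps:
            return int(self.rng.integers(len(self.means)))
        return int(np.argmax(self.means))

    def update(self, arm, r):
        self.counts[arm] += 1
        self.means[arm] += (r - self.means[arm]) / self.counts[arm]

def corral(base_algs, T, reward_fn, eta=0.1, explore=0.1, seed=0):
    """EXP3-style meta-learner over base bandit algorithms (sketch only)."""
    rng = np.random.default_rng(seed)
    M = len(base_algs)
    w = np.ones(M)
    total = 0.0
    for _ in range(T):
        # Mix in uniform exploration so importance weights stay bounded.
        p = (1 - explore) * w / w.sum() + explore / M
        i = rng.choice(M, p=p)
        arm = base_algs[i].select()      # play the chosen base's arm
        r = reward_fn(arm, rng)          # stochastic reward in [0, 1]
        base_algs[i].update(arm, r)
        total += r
        w[i] *= np.exp(eta * r / p[i])   # importance-weighted update
        w /= w.max()                     # keep weights numerically stable
    return total

# Two-armed Bernoulli environment; the meta-learner shifts weight toward
# whichever base algorithm earns more reward.
means = [0.2, 0.8]
def bernoulli(arm, rng):
    return float(rng.random() < means[arm])

bases = [EpsGreedy(2, eps, np.random.default_rng(s))
         for eps, s in [(0.1, 1), (0.5, 2)]]
total = corral(bases, 2000, bernoulli)
```

The difficulty the paper addresses is that naive corralling like this can starve a base algorithm of samples and break its own regret guarantee.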
no code implementations • NeurIPS 2021 • Christoph Dann, Teodor V. Marinov, Mehryar Mohri, Julian Zimmert
Our results show that optimistic algorithms cannot achieve the information-theoretic lower bounds, even in deterministic MDPs, unless there is a unique optimal policy.
no code implementations • NeurIPS 2021 • Teodor V. Marinov, Julian Zimmert
Recent progress in model selection raises the question of the fundamental limits of these techniques.
no code implementations • 20 Jun 2022 • Teodor V. Marinov, Mehryar Mohri, Julian Zimmert
We revisit the problem of stochastic online learning with feedback graphs, with the goal of devising algorithms that are optimal, up to constants, both asymptotically and in finite time.
no code implementations • 7 Feb 2023 • Alekh Agarwal, Claudio Gentile, Teodor V. Marinov
We study contextual bandit (CB) problems, where the user can sometimes respond with the best action in a given context.
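A toy version of such mixed feedback is sketched below: a linear contextual bandit where, with some probability, the environment also reveals the best action for the current context, which the learner can exploit as a supervised signal. Every name and the linear-reward setup are assumptions for illustration, not the paper's model or algorithm.

```python
import numpy as np

def cb_with_best_action_feedback(T, d, n_arms, reveal_prob, seed=0):
    """Epsilon-greedy linear contextual bandit with occasional
    best-action feedback (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_arms, d))    # true per-arm parameters
    est = np.zeros((n_arms, d))             # ridge-regression estimates
    A = np.stack([np.eye(d)] * n_arms)      # per-arm Gram matrices
    b = np.zeros((n_arms, d))
    correct = 0
    for _ in range(T):
        x = rng.normal(size=d)
        # epsilon-greedy arm choice on the current estimates
        if rng.random() < 0.1:
            a = int(rng.integers(n_arms))
        else:
            a = int(np.argmax(est @ x))
        best = int(np.argmax(theta @ x))
        correct += (a == best)
        r = theta[a] @ x + 0.1 * rng.normal()
        A[a] += np.outer(x, x); b[a] += r * x
        est[a] = np.linalg.solve(A[a], b[a])
        if rng.random() < reveal_prob:
            # The best action is revealed: update its estimate as well.
            rb = theta[best] @ x + 0.1 * rng.normal()
            A[best] += np.outer(x, x); b[best] += rb * x
            est[best] = np.linalg.solve(A[best], b[best])
    return correct / T

accuracy = cb_with_best_action_feedback(2000, 3, 3, reveal_prob=0.3)
```

The revealed-best-action rounds act like labeled classification examples, which is why this feedback can sharply reduce the exploration a bandit learner would otherwise need.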
no code implementations • 26 May 2023 • Jacob Abernethy, Alekh Agarwal, Teodor V. Marinov, Manfred K. Warmuth
We study the phenomenon of \textit{in-context learning} (ICL) exhibited by large language models, where they can adapt to a new learning task, given a handful of labeled examples, without any explicit parameter optimization.
no code implementations • 28 Mar 2024 • Teodor V. Marinov, Alekh Agarwal, Mircea Trofin
This work studies a Reinforcement Learning (RL) problem in which we are given a set of trajectories collected with $K$ baseline policies.