Search Results for author: Nevena Lazic

Found 18 papers, 3 papers with code

A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits

no code implementations 17 Jan 2022 Yasin Abbasi-Yadkori, Andras Gyorgy, Nevena Lazic

We propose a method that achieves, in $K$-armed bandit problems, a near-optimal $\widetilde O(\sqrt{K N(S+1)})$ dynamic regret, where $N$ is the time horizon of the problem and $S$ is the number of times the identity of the optimal arm changes, without prior knowledge of $S$.

Improved Regret Bound and Experience Replay in Regularized Policy Iteration

no code implementations 25 Feb 2021 Nevena Lazic, Dong Yin, Yasin Abbasi-Yadkori, Csaba Szepesvari

We first show that the regret analysis of the Politex algorithm (a version of regularized policy iteration) can be sharpened from $O(T^{3/4})$ to $O(\sqrt{T})$ under nearly identical assumptions, and instantiate the bound with linear function approximation.

A maximum-entropy approach to off-policy evaluation in average-reward MDPs

no code implementations NeurIPS 2020 Nevena Lazic, Dong Yin, Mehrdad Farajtabar, Nir Levine, Dilan Gorur, Chris Harris, Dale Schuurmans

This work focuses on off-policy evaluation (OPE) with function approximation in infinite-horizon undiscounted Markov decision processes (MDPs).

Off-policy evaluation

Robotic Table Tennis with Model-Free Reinforcement Learning

no code implementations 31 Mar 2020 Wenbo Gao, Laura Graesser, Krzysztof Choromanski, Xingyou Song, Nevena Lazic, Pannag Sanketi, Vikas Sindhwani, Navdeep Jaitly

We propose a model-free algorithm for learning efficient policies capable of returning table tennis balls by controlling robot joints at a rate of 100Hz.

reinforcement-learning Reinforcement Learning (RL)

Adaptive Approximate Policy Iteration

1 code implementation 8 Feb 2020 Botao Hao, Nevena Lazic, Yasin Abbasi-Yadkori, Pooria Joulani, Csaba Szepesvari

This is an improvement over the best existing bound of $\tilde{O}(T^{3/4})$ for the average-reward case with function approximation.

Exploration-Enhanced POLITEX

no code implementations 27 Aug 2019 Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari, Gellert Weisz

POLITEX has sublinear regret guarantees in uniformly-mixing MDPs when the value estimation error can be controlled, which can be satisfied if all policies sufficiently explore the environment.

Data center cooling using model-predictive control

1 code implementation NeurIPS 2018 Nevena Lazic, Craig Boutilier, Tyler Lu, Eehern Wong, Binz Roy, Mk Ryu, Greg Imwalle

Despite impressive recent advances in reinforcement learning (RL), its deployment in real-world physical systems is often complicated by unexpected events, limited data, and the potential for expensive failures.

Model Predictive Control reinforcement-learning

Online Linear Quadratic Control

no code implementations ICML 2018 Alon Cohen, Avinatan Hassidim, Tomer Koren, Nevena Lazic, Yishay Mansour, Kunal Talwar

We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses.

Model-Free Linear Quadratic Control via Reduction to Expert Prediction

no code implementations 17 Apr 2018 Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari

Model-free approaches for reinforcement learning (RL) and continuous control find policies based only on past states and rewards, without fitting a model of the system dynamics.

Continuous Control Reinforcement Learning (RL)

Sketching and Neural Networks

no code implementations 19 Apr 2016 Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar

In stark contrast, our approach of improper learning with a larger hypothesis class allows the sketch size to have a logarithmic dependence on the degree.

Plato: A Selective Context Model for Entity Resolution

no code implementations TACL 2015 Nevena Lazic, Amarnag Subramanya, Michael Ringgaard, Fernando Pereira

We present Plato, a probabilistic model for entity resolution that includes a novel approach for handling noisy or uninformative features, and supplements labeled training data derived from Wikipedia with a very large unlabeled text corpus.

Entity Linking Entity Resolution

Context-Dependent Fine-Grained Entity Type Tagging

4 code implementations 3 Dec 2014 Dan Gillick, Nevena Lazic, Kuzman Ganchev, Jesse Kirchner, David Huynh

We propose the task of context-dependent fine type tagging, where the set of acceptable labels for a mention is restricted to only those deducible from the local context (e.g., sentence or document).

Sentence Vocal Bursts Type Prediction
