Search Results for author: Wouter M. Koolen

Found 22 papers, 1 paper with code

Robust Online Convex Optimization in the Presence of Outliers

no code implementations5 Jul 2021 Tim van Erven, Sarah Sachs, Wouter M. Koolen, Wojciech Kotłowski

If the outliers are chosen adversarially, we show that a simple filtering strategy on extreme gradients incurs O(k) additive overhead compared to the usual regret bounds, and that this is unimprovable; hence k must be sublinear in the number of rounds for the regret guarantee to remain non-trivial.
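
The filtering idea lends itself to a short illustration. Below is a minimal sketch (not the paper's exact strategy) of online gradient descent that simply skips any update whose gradient norm exceeds a threshold; the threshold, step size, and loss setup are illustrative assumptions.

```python
import numpy as np

def filtered_ogd(grads, eta=0.1, threshold=10.0, dim=2):
    """Online gradient descent that ignores rounds whose gradient norm
    exceeds a threshold -- a toy version of filtering extreme gradients."""
    x = np.zeros(dim)
    for g in grads:
        if np.linalg.norm(g) <= threshold:  # keep only "reasonable" gradients
            x = x - eta * g                 # standard OGD step otherwise
    return x

# Mostly benign gradients, plus k = 3 adversarial outlier rounds.
rng = np.random.default_rng(0)
grads = [rng.normal(size=2) for _ in range(100)]
for i in (10, 50, 90):
    grads[i] = 1e6 * rng.normal(size=2)
print(filtered_ogd(grads))
```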

MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

no code implementations12 Feb 2021 Tim van Erven, Wouter M. Koolen, Dirk van der Hoeven

We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature.
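
As a rough illustration of the multiple-learning-rate idea, the sketch below runs one gradient-descent copy per learning rate on an exponential grid and aggregates them with exponential weights on their observed losses. The real MetaGrad master uses quadratic surrogate losses and a tilted update; `grad_fn` and `loss_fn` here are placeholder assumptions, not the paper's interface.

```python
import numpy as np

def multi_eta_sketch(grad_fn, loss_fn, T=500, dim=2, etas=None):
    """Toy skeleton of the multiple-learning-rate idea: one OGD copy per
    learning rate, combined by exponential weights on observed losses.
    (MetaGrad itself uses quadratic surrogate losses and a tilted master.)"""
    if etas is None:
        etas = [2.0 ** -k for k in range(6)]      # exponential grid of rates
    slaves = [np.zeros(dim) for _ in etas]
    logw = np.zeros(len(etas))                    # master log-weights
    for _ in range(T):
        w = np.exp(logw - logw.max()); w /= w.sum()
        x = sum(wi * xi for wi, xi in zip(w, slaves))   # master prediction
        g = grad_fn(x)                            # gradient at master point
        for i, eta in enumerate(etas):
            logw[i] -= loss_fn(slaves[i])         # downweight lossy slaves
            slaves[i] = slaves[i] - eta * g       # each slave uses its own eta
    return x

# Example: minimize f(x) = ||x - c||^2 for a hidden c.
c = np.array([1.0, -2.0])
print(multi_eta_sketch(lambda x: 2 * (x - c), lambda x: np.sum((x - c) ** 2)))
```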

Regret Minimization in Heavy-Tailed Bandits

no code implementations7 Feb 2021 Shubhada Agrawal, Sandeep Juneja, Wouter M. Koolen

We show that our index concentrates faster than the well known truncated or trimmed empirical mean estimators for the mean of heavy-tailed distributions.
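
For reference, the two baseline estimators named here are easy to state in code. This sketch implements the standard truncated (clipped) and trimmed means, with the clipping level and trim count chosen arbitrarily; the paper's own index is a different construction.

```python
import numpy as np

def truncated_mean(x, b):
    """Mean after clipping each observation to [-b, b]."""
    return np.mean(np.clip(x, -b, b))

def trimmed_mean(x, k):
    """Mean after discarding the k smallest and k largest observations."""
    xs = np.sort(x)
    return xs[k:len(xs) - k].mean()

rng = np.random.default_rng(1)
# Heavy-tailed sample: Student-t with 2.5 degrees of freedom
# (finite mean, but heavy tails).
x = rng.standard_t(df=2.5, size=10_000)
print(np.mean(x), truncated_mean(x, b=10.0), trimmed_mean(x, k=100))
```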

Optimal Best-Arm Identification Methods for Tail-Risk Measures

no code implementations NeurIPS 2021 Shubhada Agrawal, Wouter M. Koolen, Sandeep Juneja

Conditional value-at-risk (CVaR) and value-at-risk (VaR) are popular tail-risk measures in finance and insurance industries as well as in highly reliable, safety-critical uncertain environments where often the underlying probability distributions are heavy-tailed.
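
A minimal sketch of the empirical versions of these two tail-risk measures; conventions vary slightly across the literature, and losses are taken here with larger = worse.

```python
import numpy as np

def var_cvar(losses, alpha=0.95):
    """Empirical value-at-risk and conditional value-at-risk at level alpha."""
    xs = np.sort(losses)
    var = np.quantile(xs, alpha)     # VaR: the alpha-quantile of the losses
    cvar = xs[xs >= var].mean()      # CVaR: mean loss beyond the VaR
    return var, cvar

rng = np.random.default_rng(2)
losses = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)  # heavy right tail
print(var_cvar(losses))
```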

Structure Adaptive Algorithms for Stochastic Bandits

no code implementations ICML 2020 Rémy Degenne, Han Shao, Wouter M. Koolen

We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc.

Lipschitz and Comparator-Norm Adaptivity in Online Learning

no code implementations27 Feb 2020 Zakaria Mhammedi, Wouter M. Koolen

We study Online Convex Optimization in the unbounded setting where neither predictions nor gradients are constrained.

Non-Asymptotic Pure Exploration by Solving Games

no code implementations NeurIPS 2019 Rémy Degenne, Wouter M. Koolen, Pierre Ménard

Pure exploration (aka active testing) is the fundamental task of sequentially gathering information to answer a query about a stochastic environment.

Lipschitz Adaptivity with Multiple Learning Rates in Online Learning

no code implementations27 Feb 2019 Zakaria Mhammedi, Wouter M. Koolen, Tim van Erven

For MetaGrad, we further improve the computational efficiency of handling constraints on the domain of prediction, and we remove the need to specify the number of rounds in advance.
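
The textbook way to remove a known-horizon requirement is the doubling trick, sketched below. Note this restart-based workaround is not what the paper does; the paper handles an unknown horizon directly.

```python
def doubling_trick(run_for_horizon, total_rounds):
    """Play an unknown number of rounds by restarting a horizon-tuned
    algorithm on epochs of length 1, 2, 4, 8, ..."""
    played, horizon = 0, 1
    while played < total_rounds:
        run_for_horizon(min(horizon, total_rounds - played))
        played += horizon
        horizon *= 2

doubling_trick(lambda T: print(f"fresh run tuned for {T} rounds"), 20)
```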

Pure Exploration with Multiple Correct Answers

no code implementations NeurIPS 2019 Rémy Degenne, Wouter M. Koolen

We present a new algorithm which extends Track-and-Stop to the multiple-answer case and has asymptotic sample complexity matching the lower bound.
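
A heavily simplified flavor of this family of methods: the toy loop below does best-arm identification for unit-variance Gaussian arms with a GLR-style stopping rule and a heuristic threshold, but replaces the carefully tracked sampling proportions of Track-and-Stop with naive round-robin sampling.

```python
import numpy as np

def glr_stop_sketch(means, delta=0.05, max_t=100_000, seed=3):
    """Toy best-arm identification for unit-variance Gaussian arms:
    round-robin sampling plus a GLR-style stopping rule."""
    rng = np.random.default_rng(seed)
    K = len(means)
    n, s = np.zeros(K), np.zeros(K)
    for t in range(1, max_t + 1):
        a = t % K                                  # naive round-robin sampling
        s[a] += rng.normal(means[a]); n[a] += 1
        if n.min() < 2:
            continue
        mu = s / n
        best = int(np.argmax(mu))
        # Gaussian GLR between the empirical best and each contender.
        z = [(n[best] * n[b] / (n[best] + n[b])) * (mu[best] - mu[b]) ** 2 / 2
             for b in range(K) if b != best]
        if min(z) > np.log((1 + np.log(t)) * K / delta):  # heuristic threshold
            return best, t                         # recommended arm, stop time
    return None

print(glr_stop_sketch([0.5, 0.3, 0.0]))
```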

Random Permutation Online Isotonic Regression

no code implementations NeurIPS 2017 Wojciech Kotłowski, Wouter M. Koolen, Alan Malek

We revisit isotonic regression on linear orders, the problem of fitting monotonic functions to best explain the data, in an online setting.
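
The classical batch solver for this problem is the Pool Adjacent Violators algorithm (PAVA), sketched below; the paper studies the harder online protocol, where batch fits like this are only a building block.

```python
import numpy as np

def pava(y):
    """Pool Adjacent Violators: the non-decreasing fit that minimizes
    the sum of squared errors to y."""
    # Each block stores (mean, weight); merge backwards while out of order.
    means, weights = [], []
    for v in y:
        m, w = float(v), 1
        while means and means[-1] > m:     # violation: pool the two blocks
            m = (m * w + means[-1] * weights[-1]) / (w + weights[-1])
            w += weights[-1]
            means.pop(); weights.pop()
        means.append(m); weights.append(w)
    return np.repeat(means, weights)

print(pava([1.0, 3.0, 2.0, 4.0, 3.5]))  # -> [1.0, 2.5, 2.5, 3.75, 3.75]
```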

Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning

no code implementations NeurIPS 2016 Wouter M. Koolen, Peter Grünwald, Tim van Erven

We consider online learning algorithms that guarantee worst-case regret rates in adversarial environments (so they can be deployed safely and will perform robustly), yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance).

MetaGrad: Multiple Learning Rates in Online Learning

1 code implementation NeurIPS 2016 Tim van Erven, Wouter M. Koolen

In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions.

Online Isotonic Regression

no code implementations14 Mar 2016 Wojciech Kotłowski, Wouter M. Koolen, Alan Malek

We then prove that the Exponential Weights algorithm played over a covering net of isotonic functions has a regret bounded by $O\big(T^{1/3} \log^{2/3}(T)\big)$ and present a matching $\Omega(T^{1/3})$ lower bound on regret.
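
A minimal sketch of this construction: enumerate a crude covering net of isotonic functions over a finite grid of values and run Exponential Weights over it under squared loss. For simplicity the data points arrive in order here, and the grid resolution and learning rate are arbitrary choices.

```python
import numpy as np
from itertools import combinations_with_replacement

def isotonic_net(n, levels):
    """All non-decreasing length-n sequences over a finite grid of levels:
    a crude covering net of isotonic functions on n points."""
    return np.array(list(combinations_with_replacement(levels, n)))

def ew_isotonic(y, net, eta=2.0):
    """Exponential Weights over the net under squared loss."""
    logw = np.zeros(len(net))
    preds = []
    for t, yt in enumerate(y):
        p = np.exp(logw - logw.max()); p /= p.sum()
        preds.append(p @ net[:, t])             # weighted-average prediction
        logw -= eta * (net[:, t] - yt) ** 2     # squared-loss weight update
    return np.array(preds)

rng = np.random.default_rng(7)
n = 8
y = np.clip(np.linspace(0.1, 0.9, n) + 0.1 * rng.normal(size=n), 0, 1)
net = isotonic_net(n, np.linspace(0, 1, 5))
print(ew_isotonic(y, net).round(2))
```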

Second-order Quantile Methods for Experts and Combinatorial Games

no code implementations27 Feb 2015 Wouter M. Koolen, Tim van Erven

We aim to design strategies for sequential decision making that adjust to the difficulty of the learning problem.

Efficient Minimax Strategies for Square Loss Games

no code implementations NeurIPS 2014 Wouter M. Koolen, Alan Malek, Peter L. Bartlett

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance.
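
For concreteness, the squared Mahalanobis distance generalizes the squared Euclidean distance via a positive definite matrix W:

```python
import numpy as np

def sq_mahalanobis(x, y, W):
    """Squared Mahalanobis distance (x - y)^T W (x - y) for positive
    definite W; W = I recovers the squared Euclidean distance."""
    d = x - y
    return d @ W @ d

x, y = np.array([1.0, 2.0]), np.array([0.0, 0.0])
W = np.array([[2.0, 0.5], [0.5, 1.0]])
print(sq_mahalanobis(x, y, np.eye(2)), sq_mahalanobis(x, y, W))  # 5.0, 8.0
```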

Learning the Learning Rate for Prediction with Expert Advice

no code implementations NeurIPS 2014 Wouter M. Koolen, Tim van Erven, Peter Grünwald

Most standard algorithms for prediction with expert advice depend on a parameter called the learning rate.

The Pareto Regret Frontier

no code implementations NeurIPS 2013 Wouter M. Koolen

In the common case of large but structured expert sets we typically wish to keep the regret especially small compared to simple experts, at the cost of modest additional overhead compared to more complex others.

Universal Codes from Switching Strategies

no code implementations26 Nov 2013 Wouter M. Koolen, Steven de Rooij

We discuss algorithms for combining sequential prediction strategies, a task which can be viewed as a natural generalisation of the concept of universal coding.
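
The basic universal-coding fact behind such combinations: a uniform Bayes mixture of K sequential prediction strategies codes any sequence within log K nats of the best single strategy. A small numerical sketch with synthetic log-probabilities:

```python
import numpy as np

def mixture_overhead(log_probs):
    """Extra code length (in nats) of the uniform Bayes mixture of K
    strategies over the best single strategy in hindsight.
    log_probs[i] = log P_i(x^T), strategy i's log-probability of the
    whole observed sequence."""
    K = len(log_probs)
    m = log_probs.max()
    # -log( (1/K) * sum_i P_i(x^T) ), computed stably via the max trick.
    mixture_len = -(m + np.log(np.mean(np.exp(log_probs - m))))
    best_len = -log_probs.max()
    return mixture_len - best_len   # provably at most log(K)

rng = np.random.default_rng(5)
# 8 strategies, each assigning per-symbol probabilities to 200 outcomes.
log_probs = np.log(rng.uniform(0.3, 0.9, size=(8, 200))).sum(axis=1)
print(mixture_overhead(log_probs), np.log(8))
```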

Putting Bayes to sleep

no code implementations NeurIPS 2012 Dmitry Adamskiy, Manfred K. Warmuth, Wouter M. Koolen

If the nature of the data is changing over time in that different models predict well on different segments of the data, then adaptivity is typically achieved by mixing into the weights in each round a bit of the initial prior (kind of like a weak restart).
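
The mixing scheme described here is the classical Fixed-Share-style update; a minimal sketch with a uniform prior (the values of eta and alpha are arbitrary):

```python
import numpy as np

def fixed_share_update(w, losses, eta, alpha):
    """One round of Fixed-Share: an exponential-weights update followed by
    mixing a fraction alpha of the uniform prior back into the weights."""
    w = w * np.exp(-eta * losses)        # exponential-weights loss update
    w = w / w.sum()
    K = len(w)
    return (1 - alpha) * w + alpha / K   # "weak restart" toward the prior

w = np.ones(4) / 4
for losses in ([0.0, 1.0, 1.0, 1.0], [1.0, 0.0, 1.0, 1.0]):
    w = fixed_share_update(w, np.array(losses), eta=1.0, alpha=0.1)
    print(np.round(w, 3))
```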

Learning Eigenvectors for Free

no code implementations NeurIPS 2011 Wouter M. Koolen, Wojciech Kotłowski, Manfred K. Warmuth

In this extension, the alphabet of $n$ outcomes is replaced by the set of all dyads, i.e. outer products $uu^\top$ where $u$ is a vector in $\mathbb{R}^n$ of unit length.

Adaptive Hedge

no code implementations NeurIPS 2011 Tim van Erven, Wouter M. Koolen, Steven de Rooij, Peter Grünwald

In most previous analyses the learning rate was carefully tuned to obtain optimal worst-case performance, leading to suboptimal performance on easy instances, for example when there exists an action that is significantly better than all others.
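
This sensitivity is easy to reproduce: on an "easy" instance with one clearly dominant action, Hedge with the worst-case-tuned learning rate concentrates slowly, while a larger rate incurs far less regret. A small sketch (the losses and rates are illustrative):

```python
import numpy as np

def hedge_regret(loss_matrix, eta):
    """Cumulative regret of Hedge with a fixed learning rate eta;
    loss_matrix[t, i] is action i's loss in round t, in [0, 1]."""
    T, K = loss_matrix.shape
    logw = np.zeros(K); total = 0.0
    for t in range(T):
        p = np.exp(logw - logw.max()); p /= p.sum()
        total += p @ loss_matrix[t]          # learner's expected loss
        logw -= eta * loss_matrix[t]         # exponential weight update
    return total - loss_matrix.sum(axis=0).min()

# Easy instance: action 0 is always strictly best.
rng = np.random.default_rng(6)
T, K = 10_000, 10
L = rng.uniform(0.5, 1.0, size=(T, K))
L[:, 0] = rng.uniform(0.0, 0.2, size=T)
for eta in (np.sqrt(8 * np.log(K) / T), 1.0):  # worst-case tuning vs. large eta
    print(eta, hedge_regret(L, eta))
```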
