Search Results for author: Ilja Kuzborskij

Found 20 papers, 1 paper with code

Better-than-KL PAC-Bayes Bounds

no code implementations14 Feb 2024 Ilja Kuzborskij, Kwang-Sung Jun, Yulian Wu, Kyoungseok Jang, Francesco Orabona

In this paper, we consider the problem of proving concentration inequalities to estimate the mean of a sequence of random variables.

Inductive Bias

Tighter PAC-Bayes Bounds Through Coin-Betting

no code implementations12 Feb 2023 Kyoungseok Jang, Kwang-Sung Jun, Ilja Kuzborskij, Francesco Orabona

We consider the problem of estimating the mean of a sequence of random elements $f(X_1, \theta), \ldots, f(X_n, \theta)$ where $f$ is a fixed scalar function, $S=(X_1, \ldots, X_n)$ are independent random variables, and $\theta$ is a possibly $S$-dependent parameter.
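As a point of reference for this estimation problem, the sketch below computes the empirical mean of bounded observations together with a classical Hoeffding-style confidence radius; the data and the value range are assumptions, and this is the standard baseline rather than the paper's coin-betting bound.

```python
# Minimal sketch: empirical mean of bounded observations with a Hoeffding-style
# confidence radius (the classical baseline, not the coin-betting construction).
import numpy as np

rng = np.random.default_rng(0)
n, delta = 1000, 0.05
f_values = rng.beta(2, 5, size=n)                # f(X_i, theta) assumed to lie in [0, 1]

mean_hat = f_values.mean()
radius = np.sqrt(np.log(2 / delta) / (2 * n))    # two-sided Hoeffding radius for [0, 1] variables
print(f"estimate: {mean_hat:.3f} +/- {radius:.3f} with probability at least {1 - delta}")
```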

Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks

no code implementations28 Dec 2022 Ilja Kuzborskij, Csaba Szepesvári

We explore the ability of overparameterized shallow ReLU neural networks to learn Lipschitz, nondifferentiable, bounded functions with additive noise when trained by Gradient Descent (GD).
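A minimal sketch of the regime described above, under assumed toy choices (1-D inputs, the nondifferentiable Lipschitz target |x|, additive Gaussian noise, full-batch gradient descent); it only illustrates the training setup, not the paper's analysis.

```python
import torch

torch.manual_seed(0)
n, width = 200, 2048                          # overparameterized: width >> number of samples
x = torch.rand(n, 1) * 2 - 1                  # inputs in [-1, 1]
y = x.abs() + 0.1 * torch.randn(n, 1)         # Lipschitz, nondifferentiable target + additive noise

model = torch.nn.Sequential(
    torch.nn.Linear(1, width), torch.nn.ReLU(), torch.nn.Linear(width, 1)
)
opt = torch.optim.SGD(model.parameters(), lr=0.01)   # full batch, so this is plain GD

for step in range(5000):
    opt.zero_grad()
    loss = torch.mean((model(x) - y) ** 2)
    loss.backward()
    opt.step()
print(f"final training MSE: {loss.item():.4f}")
```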

Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel

no code implementations NeurIPS 2021 Dominic Richards, Ilja Kuzborskij

We revisit on-average algorithmic stability of GD for training overparameterised shallow neural networks and prove new generalisation and excess risk bounds without the NTK or PL assumptions.

On the Role of Optimization in Double Descent: A Least Squares Study

no code implementations NeurIPS 2021 Ilja Kuzborskij, Csaba Szepesvári, Omar Rivasplata, Amal Rannen-Triki, Razvan Pascanu

Empirically it has been observed that the performance of deep neural networks steadily improves as we increase model size, contradicting the classical view on overfitting and generalization.
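The least-squares setting studied in this paper can be illustrated with a small random-features experiment (assumed toy data): the test error of the minimum-norm solution typically spikes near the interpolation threshold p ≈ n and decreases again as the number of features grows, which is the double-descent shape.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_test = 100, 20, 2000
w_star = rng.normal(size=d)
X, X_test = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y = X @ w_star + 0.5 * rng.normal(size=n)
y_test = X_test @ w_star

for p in [20, 60, 95, 100, 105, 200, 1000]:              # number of random ReLU features
    V = rng.normal(size=(d, p)) / np.sqrt(d)
    Phi, Phi_test = np.maximum(X @ V, 0), np.maximum(X_test @ V, 0)
    beta = np.linalg.pinv(Phi) @ y                        # minimum-norm least-squares solution
    test_mse = np.mean((Phi_test @ beta - y_test) ** 2)
    print(f"p = {p:4d}  test MSE = {test_mse:.2f}")
```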

Nonparametric Regression with Shallow Overparameterized Neural Networks Trained by GD with Early Stopping

no code implementations12 Jul 2021 Ilja Kuzborskij, Csaba Szepesvári

We explore the ability of overparameterized shallow neural networks to learn Lipschitz regression functions with and without label noise when trained by Gradient Descent (GD).

regression
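The paper analyses GD halted at an early-stopping time; a common practical proxy (assumed here, not the theoretically chosen stopping iteration from the paper) is to stop once held-out error stops improving, as in this sketch.

```python
import torch

torch.manual_seed(0)
x = torch.rand(300, 1) * 2 - 1
y = x.abs() + 0.2 * torch.randn_like(x)                  # noisy labels of a Lipschitz target
x_tr, y_tr, x_va, y_va = x[:200], y[:200], x[200:], y[200:]

model = torch.nn.Sequential(torch.nn.Linear(1, 1024), torch.nn.ReLU(), torch.nn.Linear(1024, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.01)

best_val, patience, bad_steps = float("inf"), 200, 0
for step in range(10000):
    opt.zero_grad()
    torch.mean((model(x_tr) - y_tr) ** 2).backward()
    opt.step()
    with torch.no_grad():
        val = torch.mean((model(x_va) - y_va) ** 2).item()
    if val < best_val:
        best_val, bad_steps = val, 0
    else:
        bad_steps += 1
        if bad_steps >= patience:                          # stop before fitting the label noise
            break
print(f"stopped at step {step}, validation MSE {best_val:.4f}")
```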

A Distribution-Dependent Analysis of Meta-Learning

no code implementations31 Oct 2020 Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvári

A key problem in the theory of meta-learning is to understand how the task distributions influence transfer risk, the expected error of a meta-learner on a new task drawn from the unknown task distribution.

Meta-Learning regression +1

PAC-Bayes Analysis Beyond the Usual Bounds

no code implementations NeurIPS 2020 Omar Rivasplata, Ilja Kuzborskij, Csaba Szepesvári, John Shawe-Taylor

Specifically, we present a basic PAC-Bayes inequality for stochastic kernels, from which one may derive extensions of various known PAC-Bayes bounds as well as novel bounds.

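For context, one of the "usual bounds" that the stochastic-kernel framework above generalizes is the PAC-Bayes-kl inequality; the sketch below inverts that bound numerically, with assumed values for the empirical risk and the KL term.

```python
import numpy as np

def kl_bernoulli(q, p):
    eps = 1e-12
    q, p = min(max(q, eps), 1 - eps), min(max(p, eps), 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def kl_inverse(q, c):
    # largest p in [q, 1] with kl(q || p) <= c, found by bisection
    lo, hi = q, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if kl_bernoulli(q, mid) <= c:
            lo = mid
        else:
            hi = mid
    return lo

n, delta = 5000, 0.05
emp_risk = 0.08                       # empirical risk of the posterior Q (assumed value)
kl_qp = 12.3                          # KL(Q || P) between posterior and prior (assumed value)
bound = kl_inverse(emp_risk, (kl_qp + np.log(2 * np.sqrt(n) / delta)) / n)
print(f"PAC-Bayes-kl risk bound: {bound:.3f}")
```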

Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting

1 code implementation18 Jun 2020 Ilja Kuzborskij, Claire Vernade, András György, Csaba Szepesvári

We consider off-policy evaluation in the contextual bandit setting for the purpose of obtaining a robust off-policy selection strategy, where the selection strategy is evaluated based on the value of the chosen policy in a set of proposal (target) policies.

Multi-Armed Bandits Off-policy evaluation
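A minimal sketch of the self-normalized importance weighting estimator at the core of this paper, on an assumed toy contextual-bandit log with a uniform behaviour policy; the paper's contribution, the high-confidence bounds built around this estimator for policy selection, is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_actions = 10000, 5
contexts = rng.integers(0, 3, size=n)                        # toy discrete contexts
mu = np.full((3, n_actions), 1.0 / n_actions)                 # behaviour (logging) policy: uniform
pi = np.zeros((3, n_actions))
pi[np.arange(3), [0, 1, 2]] = 1.0                             # target policy: plays action = context

actions = rng.integers(0, n_actions, size=n)                  # sampled from the uniform behaviour policy
rewards = rng.binomial(1, 0.2 + 0.6 * (actions == contexts))  # reward favours action == context

w = pi[contexts, actions] / mu[contexts, actions]             # importance weights
ips = np.mean(w * rewards)                                    # standard importance-weighted estimate
snips = np.sum(w * rewards) / np.sum(w)                       # self-normalized estimate
print(f"IPS: {ips:.3f}  SNIPS: {snips:.3f}")
```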

Locally-Adaptive Nonparametric Online Learning

no code implementations NeurIPS 2020 Ilja Kuzborskij, Nicolò Cesa-Bianchi

When competing against "simple" locality profiles, our technique delivers regret bounds that are significantly better than those proven using the previous approach.

Efron-Stein PAC-Bayesian Inequalities

no code implementations4 Sep 2019 Ilja Kuzborskij, Csaba Szepesvári

We prove semi-empirical concentration inequalities for random variables which are given as possibly nonlinear functions of independent random variables.

Generalization Bounds Off-policy evaluation
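The classical Efron-Stein inequality behind these semi-empirical bounds controls the variance of f(X_1, ..., X_n) by the expected squared change when a single coordinate is resampled. Below is a small Monte Carlo sketch of that variance proxy with an assumed toy f and distribution, not the PAC-Bayesian version developed in the paper.

```python
import numpy as np

def efron_stein_proxy(f, sampler, n, n_mc=20000, seed=0):
    """Monte Carlo estimate of (1/2) * sum_i E[(f(X) - f(X^(i)))^2],
    where X^(i) resamples coordinate i independently."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_mc):
        X = sampler(rng, n)
        i = rng.integers(n)
        Xi = X.copy()
        Xi[i] = sampler(rng, 1)[0]
        total += n * 0.5 * (f(X) - f(Xi)) ** 2   # factor n: one random coordinate stands in for the sum
    return total / n_mc

# Sanity check: for f = mean of 50 standard normals the proxy equals Var(f) = 1/50 = 0.02.
print(efron_stein_proxy(np.mean, lambda rng, k: rng.normal(size=k), n=50))
```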

Distribution-Dependent Analysis of Gibbs-ERM Principle

no code implementations5 Feb 2019 Ilja Kuzborskij, Nicolò Cesa-Bianchi, Csaba Szepesvári

This is a well-established notion of effective dimension appearing in several previous works, including the analyses of SGD and ridge regression, but ours is the first work that brings this dimension to the analysis of learning using Gibbs densities.

Stochastic Optimization
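The effective dimension referred to above is, in ridge-style analyses, typically tr(H(H + lambda I)^{-1}) for a curvature matrix H; a tiny sketch with an assumed spectrum:

```python
import numpy as np

def effective_dimension(eigvals, lam):
    # d_eff(lam) = sum_i lambda_i / (lambda_i + lam), the ridge-style effective dimension
    return np.sum(eigvals / (eigvals + lam))

eigvals = np.array([10.0, 5.0, 1.0, 0.1, 0.01, 0.001])   # spectrum of the curvature matrix (assumed)
for lam in [0.01, 0.1, 1.0]:
    print(f"lambda = {lam:5.2f}  d_eff = {effective_dimension(eigvals, lam):.2f}")
```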

Efficient Linear Bandits through Matrix Sketching

no code implementations28 Sep 2018 Ilja Kuzborskij, Leonardo Cella, Nicolò Cesa-Bianchi

More precisely, we show that a sketch of size $m$ allows a $\mathcal{O}(md)$ update time for both algorithms, as opposed to $\Omega(d^2)$ required by their non-sketched versions in general (where $d$ is the dimension of context vectors).

Thompson Sampling
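The O(md) update comes from replacing computations with the exact d x d matrix of observed contexts by a size-m sketch. A standard choice for such a sketch, assumed here as a generic illustration rather than the paper's full bandit algorithm, is Frequent Directions:

```python
import numpy as np

class FrequentDirections:
    """Streaming sketch B (2m x d) with B^T B approximating A^T A after seeing the rows of A.
    Shrinking happens once every m insertions, giving amortized O(md) time per row."""
    def __init__(self, m, d):
        self.m = m
        self.B = np.zeros((2 * m, d))
        self.next_row = 0

    def update(self, row):
        if self.next_row == 2 * self.m:                       # buffer full: shrink
            _, s, vt = np.linalg.svd(self.B, full_matrices=False)
            s2 = np.maximum(s ** 2 - s[self.m - 1] ** 2, 0.0)
            self.B = np.zeros_like(self.B)
            self.B[: self.m] = np.sqrt(s2[: self.m])[:, None] * vt[: self.m]
            self.next_row = self.m
        self.B[self.next_row] = row
        self.next_row += 1

# Usage: feed each observed context vector to the sketch; B.T @ B (plus regularization)
# then stands in for the exact design matrix used by the bandit algorithm.
fd = FrequentDirections(m=10, d=100)
rng = np.random.default_rng(0)
for _ in range(1000):
    fd.update(rng.normal(size=100))
```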

Nonparametric Online Regression while Learning the Metric

no code implementations NeurIPS 2017 Ilja Kuzborskij, Nicolò Cesa-Bianchi

We study algorithms for online nonparametric regression that learn the directions along which the regression function is smoother.

regression

Data-Dependent Stability of Stochastic Gradient Descent

no code implementations ICML 2018 Ilja Kuzborskij, Christoph H. Lampert

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds.

Generalization Bounds
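Algorithmic stability of the kind analysed here compares SGD runs on "neighbouring" datasets that differ in a single example, with the algorithmic randomness coupled across the two runs; a toy empirical sketch with an assumed logistic-regression setup (not the paper's data-dependent bounds):

```python
import numpy as np

def sgd_logreg(X, y, lr=0.1, epochs=5, seed=0):
    rng = np.random.default_rng(seed)                  # same seed => same sampling order in both runs
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= lr * (p - y[i]) * X[i]                # logistic-loss gradient step
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] > 0).astype(float)
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.normal(size=10), 1.0                # neighbouring dataset: one example replaced

w, w2 = sgd_logreg(X, y), sgd_logreg(X2, y2)
print("parameter distance between neighbouring runs:", np.linalg.norm(w - w2))
```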

When Naive Bayes Nearest Neighbors Meet Convolutional Neural Networks

no code implementations CVPR 2016 Ilja Kuzborskij, Fabio Maria Carlucci, Barbara Caputo

Since Convolutional Neural Networks (CNNs) have become the leading learning paradigm in visual recognition, Naive Bayes Nearest Neighbor (NBNN)-based classifiers have lost momentum in the community.

Domain Adaptation
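For reference, the NBNN decision rule discussed in this paper classifies an image by summing, over its local descriptors (which the paper proposes to take from CNN activations), the distance to the nearest descriptor of each class; a minimal sketch with assumed toy descriptors:

```python
import numpy as np

def nbnn_classify(query_desc, class_descs):
    """NBNN rule: pick the class minimizing the summed squared distance from each
    query descriptor to its nearest neighbour among that class's descriptors."""
    scores = {}
    for c, D in class_descs.items():
        d2 = ((query_desc[:, None, :] - D[None, :, :]) ** 2).sum(axis=-1)
        scores[c] = d2.min(axis=1).sum()
    return min(scores, key=scores.get)

rng = np.random.default_rng(0)
class_descs = {c: rng.normal(loc=c, size=(500, 64)) for c in range(3)}   # toy per-class descriptor pools
query = rng.normal(loc=1, size=(30, 64))                                 # 30 local descriptors of one image
print(nbnn_classify(query, class_descs))                                 # expected: class 1
```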

When Naïve Bayes Nearest Neighbours Meet Convolutional Neural Networks

no code implementations12 Nov 2015 Ilja Kuzborskij, Fabio Maria Carlucci, Barbara Caputo

Since Convolutional Neural Networks (CNNs) have become the leading learning paradigm in visual recognition, Naive Bayes Nearest Neighbour (NBNN)-based classifiers have lost momentum in the community.

Domain Adaptation

Fast Rates by Transferring from Auxiliary Hypotheses

no code implementations4 Dec 2014 Ilja Kuzborskij, Francesco Orabona

In this work we consider the learning setting where, in addition to the training set, the learner receives a collection of auxiliary hypotheses originating from other tasks.
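A natural illustration of this setting, under an assumed least-squares instantiation (not necessarily the paper's exact algorithm), is regularized ERM biased toward an auxiliary source hypothesis:

```python
import numpy as np

def biased_ridge(X, y, w_src, lam):
    """argmin_w ||Xw - y||^2 + lam * ||w - w_src||^2:
    least squares regularized toward an auxiliary (source) hypothesis w_src."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * w_src)

rng = np.random.default_rng(0)
w_src = rng.normal(size=5)                      # hypothesis transferred from a related task
X = rng.normal(size=(20, 5))                    # small target training set
y = X @ (w_src + 0.1 * rng.normal(size=5))      # target task close to the source
print(biased_ridge(X, y, w_src, lam=10.0))
```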

Scalable Greedy Algorithms for Transfer Learning

no code implementations6 Aug 2014 Ilja Kuzborskij, Francesco Orabona, Barbara Caputo

In this paper we consider the binary transfer learning problem, focusing on how to select and combine sources from a large pool to yield a good performance on a target task.

feature selection Transfer Learning
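The source-selection step described above can be illustrated with plain greedy forward selection over a pool of source predictors (an assumed toy version; the paper's algorithm adds scalability and combination weights on top of this idea):

```python
import numpy as np

def greedy_select(sources, X_val, y_val, k):
    """Greedily pick k source predictors whose averaged predictions
    best fit a held-out set under squared error."""
    chosen = []
    for _ in range(k):
        best, best_err = None, np.inf
        for i, h in enumerate(sources):
            if i in chosen:
                continue
            preds = np.mean([sources[j](X_val) for j in chosen + [i]], axis=0)
            err = np.mean((preds - y_val) ** 2)
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
    return chosen

rng = np.random.default_rng(0)
X_val = rng.normal(size=(50, 3))
y_val = X_val @ np.array([1.0, -1.0, 0.0])
sources = [lambda X, w=rng.normal(size=3): X @ w for _ in range(10)]   # toy pool of linear source hypotheses
print(greedy_select(sources, X_val, y_val, k=3))
```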

From N to N+1: Multiclass Transfer Incremental Learning

no code implementations CVPR 2013 Ilja Kuzborskij, Francesco Orabona, Barbara Caputo

Since the seminal work of Thrun [17], the learning to learn paradigm has been defined as the ability of an agent to improve its performance at each task with experience, i.e., as the number of tasks grows.

Incremental Learning Object Categorization +1
