Search Results for author: Benjamin Guedj

Found 56 papers, 22 papers with code

Closed-form Filtering for Non-linear Systems

no code implementations • 15 Feb 2024 • Théophile Cantelobre, Carlo Ciliberto, Benjamin Guedj, Alessandro Rudi

Sequential Bayesian Filtering aims to estimate the current state distribution of a Hidden Markov Model, given the past observations.

Computational Efficiency

A PAC-Bayesian Link Between Generalisation and Flat Minima

no code implementations • 13 Feb 2024 • Maxime Haddouche, Paul Viallard, Umut Simsekli, Benjamin Guedj

Modern machine learning usually involves predictors in the overparametrised setting (number of trained parameters greater than dataset size), and their training yields not only good performance on training data, but also good generalisation capacity.

Tighter Generalisation Bounds via Interpolation

no code implementations • 7 Feb 2024 • Paul Viallard, Maxime Haddouche, Umut Şimşekli, Benjamin Guedj

We also instantiate our bounds as training objectives, yielding non-trivial guarantees and practical performance.

A note on regularised NTK dynamics with an application to PAC-Bayesian training

no code implementations • 20 Dec 2023 • Eugenio Clerico, Benjamin Guedj

We establish explicit dynamics for neural networks whose training objective has a regularising term that constrains the parameters to remain close to their initial value.

Federated Learning with Nonvacuous Generalisation Bounds

no code implementations • 17 Oct 2023 • Pierre Jobic, Maxime Haddouche, Benjamin Guedj

We introduce a novel strategy to train randomised predictors in federated learning, where each node of the network aims at preserving its privacy by releasing a local predictor but keeping secret its training dataset with respect to the other nodes.

Federated Learning

Comparing Comparators in Generalization Bounds

1 code implementation • 16 Oct 2023 • Fredrik Hellström, Benjamin Guedj

We derive generic information-theoretic and PAC-Bayesian generalization bounds involving an arbitrary convex comparator function, which measures the discrepancy between the training and population loss.

Generalization Bounds

Generalization Bounds: Perspectives from Information Theory and PAC-Bayes

no code implementations • 8 Sep 2023 • Fredrik Hellström, Giuseppe Durisi, Benjamin Guedj, Maxim Raginsky

Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones.

Generalization Bounds

Wasserstein PAC-Bayes Learning: Exploiting Optimisation Guarantees to Explain Generalisation

no code implementations • 14 Apr 2023 • Maxime Haddouche, Benjamin Guedj

PAC-Bayes learning is an established framework to both assess the generalisation ability of learning algorithms and design new learning algorithms by exploiting generalisation bounds as training objectives.

Optimistically Tempered Online Learning

no code implementations • 18 Jan 2023 • Maxime Haddouche, Olivier Wintenberger, Benjamin Guedj

Optimistic Online Learning algorithms have been developed to exploit expert advice, assumed optimistically to be always useful.

Tighter PAC-Bayes Generalisation Bounds by Leveraging Example Difficulty

no code implementations • 20 Oct 2022 • Felix Biggs, Benjamin Guedj

We introduce a modified version of the excess risk, which can be used to obtain tighter, fast-rate PAC-Bayesian generalisation bounds.

PAC-Bayes Generalisation Bounds for Heavy-Tailed Losses through Supermartingales

no code implementations • 3 Oct 2022 • Maxime Haddouche, Benjamin Guedj

While PAC-Bayes is now an established learning framework for light-tailed losses (e.g., subgaussian or subexponential), its extension to the case of heavy-tailed losses remains largely uncharted and has attracted a growing interest in recent years.

Generalisation under gradient descent via deterministic PAC-Bayes

no code implementations • 6 Sep 2022 • Eugenio Clerico, Tyler Farghly, George Deligiannidis, Benjamin Guedj, Arnaud Doucet

We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows.

Efficient Aggregated Kernel Tests using Incomplete $U$-statistics

4 code implementations • 18 Jun 2022 • Antonin Schrab, Ilmun Kim, Benjamin Guedj, Arthur Gretton

We derive non-asymptotic uniform separation rates for MMDAggInc and HSICAggInc, and quantify exactly the trade-off between computational efficiency and the attainable rates: this result is novel for tests based on incomplete $U$-statistics, to our knowledge.

Computational Efficiency

On Margins and Generalisation for Voting Classifiers

1 code implementation • 9 Jun 2022 • Felix Biggs, Valentina Zantedeschi, Benjamin Guedj

We study the generalisation properties of majority voting on finite ensembles of classifiers, proving margin-based generalisation bounds via the PAC-Bayes theory.

Online PAC-Bayes Learning

no code implementations • 31 May 2022 • Maxime Haddouche, Benjamin Guedj

Most PAC-Bayesian bounds hold in the batch learning setting where data is collected at once, prior to inference or prediction.

Reprint: a randomized extrapolation based on principal components for data augmentation

1 code implementation • 26 Apr 2022 • Jiale Wei, Qiyuan Chen, Pai Peng, Benjamin Guedj, Le Li

This paper presents REPRINT, a simple and effective hidden-space data augmentation method for imbalanced data classification.

Data Augmentation text-classification +1

On PAC-Bayesian reconstruction guarantees for VAEs

no code implementations • 23 Feb 2022 • Badr-Eddine Chérief-Abdellatif, Yuyang Shi, Arnaud Doucet, Benjamin Guedj

Despite its wide use and empirical successes, the theoretical understanding and study of the behaviour and performance of the variational autoencoder (VAE) have only emerged in the past few years.

Controlling Multiple Errors Simultaneously with a PAC-Bayes Bound

no code implementations • 11 Feb 2022 • Reuben Adams, John Shawe-Taylor, Benjamin Guedj

Current PAC-Bayes generalisation bounds are restricted to scalar metrics of performance, such as the loss or error rate.

Classification regression

Measuring dissimilarity with diffeomorphism invariance

1 code implementation • 11 Feb 2022 • Théophile Cantelobre, Carlo Ciliberto, Benjamin Guedj, Alessandro Rudi

Measures of similarity (or dissimilarity) are a key ingredient to many machine learning algorithms.

On change of measure inequalities for $f$-divergences

no code implementations • 11 Feb 2022 • Antoine Picard-Weibel, Benjamin Guedj

We propose new change of measure inequalities based on $f$-divergences (of which the Kullback-Leibler divergence is a particular case).

Non-Vacuous Generalisation Bounds for Shallow Neural Networks

1 code implementation • 3 Feb 2022 • Felix Biggs, Benjamin Guedj

We focus on a specific class of shallow neural networks with a single hidden layer, namely those with $L_2$-normalised data and either a sigmoid-shaped Gaussian error function ("erf") activation or a Gaussian Error Linear Unit (GELU) activation.

KSD Aggregated Goodness-of-fit Test

2 code implementations • 2 Feb 2022 • Antonin Schrab, Benjamin Guedj, Arthur Gretton

KSDAgg avoids splitting the data to perform kernel selection (which leads to a loss in test power), and rather maximises the test power over a collection of kernels.

Progress in Self-Certified Neural Networks

no code implementations • 15 Nov 2021 • Maria Perez-Ortiz, Omar Rivasplata, Emilio Parrado-Hernandez, Benjamin Guedj, John Shawe-Taylor

We then show that in data starvation regimes, holding out data for the test set bounds adversely affects generalisation performance, while self-certified strategies based on PAC-Bayes bounds do not suffer from this drawback, proving that they might be a suitable choice for the small data regime.

valid

Learning PAC-Bayes Priors for Probabilistic Neural Networks

no code implementations • 21 Sep 2021 • Maria Perez-Ortiz, Omar Rivasplata, Benjamin Guedj, Matthew Gleeson, Jingyu Zhang, John Shawe-Taylor, Miroslaw Bober, Josef Kittler

We experiment on 6 datasets with different strategies and amounts of data to learn data-dependent PAC-Bayes priors, and we compare them in terms of their effect on test performance of the learnt predictors and tightness of their risk certificate.

On Margins and Derandomisation in PAC-Bayes

no code implementations • 8 Jul 2021 • Felix Biggs, Benjamin Guedj

We give a general recipe for derandomising PAC-Bayesian bounds using margins, with the critical ingredient being that our randomised predictions concentrate around some value.

Upper and Lower Bounds on the Performance of Kernel PCA

no code implementations • 18 Dec 2020 • Maxime Haddouche, Benjamin Guedj, John Shawe-Taylor

Principal Component Analysis (PCA) is a popular method for dimension reduction and has attracted an unfailing interest for decades.

Dimensionality Reduction

A PAC-Bayesian Perspective on Structured Prediction with Implicit Loss Embeddings

1 code implementation • 7 Dec 2020 • Théophile Cantelobre, Benjamin Guedj, María Pérez-Ortiz, John Shawe-Taylor

Many practical machine learning tasks can be framed as Structured prediction problems, where several output variables are predicted and considered interdependent.

Generalization Bounds Structured Prediction

Cluster-Specific Predictions with Multi-Task Gaussian Processes

1 code implementation • 16 Nov 2020 • Arthur Leroy, Pierre Latouche, Benjamin Guedj, Servane Gey

A variational EM algorithm is derived for dealing with the optimisation of the hyper-parameters along with the hyper-posteriors' estimation of latent variables and processes.

Clustering Gaussian Processes +1

Forecasting elections results via the voter model with stubborn nodes

1 code implementation • 22 Sep 2020 • Antoine Vendeville, Benjamin Guedj, Shi Zhou

We are able to perform time-evolving estimates of the model parameters and use these to forecast the vote shares for each party in any election.

MAGMA: Inference and Prediction with Multi-Task Gaussian Processes

1 code implementation • 21 Jul 2020 • Arthur Leroy, Pierre Latouche, Benjamin Guedj, Servane Gey

A novel multi-task Gaussian process (GP) framework is proposed, by using a common mean process for sharing information across tasks.

Gaussian Processes Time Series +1

PAC-Bayesian Bound for the Conditional Value at Risk

no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Benjamin Guedj, Robert C. Williamson

Conditional Value at Risk (CVaR) is a family of "coherent risk measures" which generalize the traditional mathematical expectation.

Fairness

Differentiable PAC-Bayes Objectives with Partially Aggregated Neural Networks

no code implementations • 22 Jun 2020 • Felix Biggs, Benjamin Guedj

We make three related contributions motivated by the challenge of training stochastic neural networks, particularly in a PAC-Bayesian setting: (1) we show how averaging over an ensemble of stochastic neural networks enables a new class of partially-aggregated estimators; (2) we show that these lead to provably lower-variance gradient estimates for non-differentiable signed-output networks; (3) we reformulate a PAC-Bayesian bound for these networks to derive a directly optimisable, differentiable objective and a generalisation guarantee, without using a surrogate loss or loosening the bound.

Towards control of opinion diversity by introducing zealots into a polarised social group

1 code implementation • 12 Jun 2020 • Antoine Vendeville, Benjamin Guedj, Shi Zhou

We explore a method to influence or even control the diversity of opinions within a polarised social group.

PAC-Bayes unleashed: generalisation bounds with unbounded losses

no code implementations • 12 Jun 2020 • Maxime Haddouche, Benjamin Guedj, Omar Rivasplata, John Shawe-Taylor

We present new PAC-Bayesian generalisation bounds for learning problems with unbounded loss functions.

regression

Kernel-Based Ensemble Learning in Python

1 code implementation • 17 Dec 2019 • Benjamin Guedj, Bhargav Srinivasa Desikan

We propose a new supervised learning algorithm, for classification and regression problems where two or more preliminary predictors are available.

Ensemble Learning General Classification +1

PAC-Bayesian Contrastive Unsupervised Representation Learning

1 code implementation • 10 Oct 2019 • Kento Nozawa, Pascal Germain, Benjamin Guedj

Contrastive unsupervised representation learning (CURL) is the state-of-the-art technique to learn representations (as a set of features) from unlabelled data.

Representation Learning

Still no free lunches: the price to pay for tighter PAC-Bayes bounds

no code implementations • 10 Oct 2019 • Benjamin Guedj, Louis Pujol

"No free lunch" results state the impossibility of obtaining meaningful bounds on the error of a learning algorithm without prior assumptions and modelling.

Online k-means Clustering

no code implementations • 15 Sep 2019 • Vincent Cohen-Addad, Benjamin Guedj, Varun Kanade, Guy Rom

The specific formulation we use is the $k$-means objective: at each time step the algorithm has to maintain a set of $k$ candidate centers and the loss incurred is the squared distance between the new point and the closest center.

Clustering Online Clustering
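
The per-step loss described in the Online k-means Clustering abstract above can be illustrated with a minimal sketch. It only computes the loss (squared Euclidean distance from the new point to the closest of the currently maintained centers) and sums it over a stream; it is not the paper's algorithm, and the function names `kmeans_step_loss` and `cumulative_loss` are hypothetical, chosen here for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def kmeans_step_loss(point: Point, centers: Sequence[Point]) -> float:
    """Squared Euclidean distance from `point` to its closest center."""
    return min(
        sum((p - c) ** 2 for p, c in zip(point, center))
        for center in centers
    )


def cumulative_loss(
    points: Sequence[Point],
    centers_per_step: Sequence[Sequence[Point]],
) -> float:
    """Total loss over a stream.

    Assumption: centers_per_step[t] holds the candidate centers the learner
    maintained before seeing points[t].
    """
    return sum(
        kmeans_step_loss(x, centers)
        for x, centers in zip(points, centers_per_step)
    )
```

How the centers are updated between steps is left unspecified here, since the abstract does not describe it.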

Model Validation Using Mutated Training Labels: An Exploratory Study

no code implementations • 24 May 2019 • Jie M. Zhang, Mark Harman, Benjamin Guedj, Earl T. Barr, John Shawe-Taylor

MV mutates training data labels, retrains the model against the mutated data, then uses the metamorphic relation that captures the consequent training performance changes to assess model fit.

BIG-bench Machine Learning General Classification +1

Non-linear aggregation of filters to improve image denoising

2 code implementations • 1 Apr 2019 • Benjamin Guedj, Juliette Rengot

We introduce a novel aggregation method to efficiently perform image denoising.

Image Denoising

Revisiting clustering as matrix factorisation on the Stiefel manifold

no code implementations • 11 Mar 2019 • Stéphane Chrétien, Benjamin Guedj

This paper studies clustering for possibly high-dimensional data (e.g. images, time series, gene expression data, and many other settings), and rephrases it as low-rank matrix estimation in the PAC-Bayesian framework.

Clustering Time Series +1

A Primer on PAC-Bayesian Learning

no code implementations • 16 Jan 2019 • Benjamin Guedj

Generalised Bayesian learning algorithms are increasingly popular in machine learning, due to their PAC generalisation properties and flexibility.

BIG-bench Machine Learning

Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly

no code implementations • 18 May 2018 • Benjamin Guedj, Le Li

When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls.

Dimensionality Reduction

Pycobra: A Python Toolbox for Ensemble Learning and Visualisation

1 code implementation • 25 Apr 2017 • Benjamin Guedj, Bhargav Srinivasa Desikan

We introduce pycobra, a Python library devoted to ensemble learning (regression and classification) and visualisation.

BIG-bench Machine Learning Ensemble Learning +1

Simpler PAC-Bayesian Bounds for Hostile Data

no code implementations • 23 Oct 2016 • Pierre Alquier, Benjamin Guedj

In these bounds the Kullback-Leibler divergence is replaced with a general version of Csiszár's $f$-divergence.

Stability revisited: new generalisation bounds for the Leave-one-Out

no code implementations • 23 Aug 2016 • Alain Celisse, Benjamin Guedj

The present paper provides a new generic strategy leading to non-asymptotic theoretical guarantees on the Leave-one-Out procedure applied to a broad class of learning algorithms.

regression

A Quasi-Bayesian Perspective to Online Clustering

no code implementations • 1 Feb 2016 • Le Li, Benjamin Guedj, Sébastien Loustau

When faced with high frequency streams of data, clustering raises theoretical and algorithmic pitfalls.

Clustering Online Clustering

An Oracle Inequality for Quasi-Bayesian Non-Negative Matrix Factorization

no code implementations • 6 Jan 2016 • Pierre Alquier, Benjamin Guedj

The aim of this paper is to provide some theoretical understanding of quasi-Bayesian aggregation methods for non-negative matrix factorization.

PAC-Bayesian High Dimensional Bipartite Ranking

no code implementations • 9 Nov 2015 • Benjamin Guedj, Sylvain Robbiano

This paper is devoted to the bipartite ranking problem, a classical statistical learning task, in a high dimensional setting.

Vocal Bursts Intensity Prediction
