Search Results for author: Massimiliano Pontil

Found 93 papers, 29 papers with code

On the Iteration Complexity of Hypergradient Computations

no code implementations ICML 2020 Riccardo Grazzi, Saverio Salzo, Massimiliano Pontil, Luca Franceschi

We study a general class of bilevel optimization problems, in which the upper-level objective is defined via the solution of a fixed point equation.

Bilevel Optimization Computational Efficiency +1

Neural Conditional Probability for Inference

no code implementations 1 Jul 2024 Vladimir R. Kostic, Karim Lounici, Gregoire Pacreau, Pietro Novelli, Giacomo Turri, Massimiliano Pontil

We introduce NCP (Neural Conditional Probability), a novel operator-theoretic approach for learning conditional distributions with a particular focus on inference tasks.

Operator World Models for Reinforcement Learning

no code implementations 28 Jun 2024 Pietro Novelli, Marco Pratticò, Massimiliano Pontil, Carlo Ciliberto

Policy Mirror Descent (PMD) is a powerful and theoretically sound methodology for sequential decision-making.

Decision Making reinforcement-learning +1

From Biased to Unbiased Dynamics: An Infinitesimal Generator Approach

no code implementations 13 Jun 2024 Timothée Devergne, Vladimir Kostic, Michele Parrinello, Massimiliano Pontil

We contrast our approach to more common ones based on the transfer operator, showing that it can provably learn the spectral properties of the unbiased system from biased data.

Contextual Continuum Bandits: Static Versus Dynamic Regret

no code implementations 9 Jun 2024 Arya Akhavan, Karim Lounici, Massimiliano Pontil, Alexandre B. Tsybakov

We study the contextual continuum bandits problem, where the learner sequentially receives a side information vector and has to choose an action in a convex set, minimizing a function associated to the context.

Learning the Infinitesimal Generator of Stochastic Diffusion Processes

no code implementations 21 May 2024 Vladimir R. Kostic, Karim Lounici, Helene Halconruy, Timothee Devergne, Massimiliano Pontil

Additionally, we elucidate how the distortion between the intrinsic energy-induced metric of the stochastic diffusion and the RKHS metric used for generator estimation impacts the spectral learning bounds.

Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates

1 code implementation 18 Mar 2024 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

In the deterministic case, we provide a linear rate for AID and an improved linear rate for ITD which closely match the ones for the smooth setting.

Data Poisoning Hyperparameter Optimization +1

A randomized algorithm to solve reduced rank operator regression

1 code implementation 28 Dec 2023 Giacomo Turri, Vladimir Kostic, Pietro Novelli, Massimiliano Pontil

We present and analyze an algorithm designed for addressing vector-valued regression problems involving possibly infinite-dimensional input and output spaces.

regression

Consistent Long-Term Forecasting of Ergodic Dynamical Systems

no code implementations 20 Dec 2023 Prune Inzerilli, Vladimir Kostic, Karim Lounici, Pietro Novelli, Massimiliano Pontil

We study the evolution of distributions under the action of an ergodic dynamical system, which may be stochastic in nature.

Dynamics Harmonic Analysis of Robotic Systems: Application in Data-Driven Koopman Modelling

1 code implementation 12 Dec 2023 Daniel Ordoñez-Apraez, Vladimir Kostic, Giulio Turrisi, Pietro Novelli, Carlos Mastalli, Claudio Semini, Massimiliano Pontil

We introduce the use of harmonic analysis to decompose the state space of symmetric robotic systems into orthogonal isotypic subspaces.

Learning invariant representations of time-homogeneous stochastic dynamical systems

1 code implementation 19 Jul 2023 Vladimir R. Kostic, Pietro Novelli, Riccardo Grazzi, Karim Lounici, Massimiliano Pontil

We consider the general class of time-homogeneous stochastic dynamical systems, both discrete and continuous, and study the problem of learning a representation of the state that faithfully captures its dynamics.

Learning Theory

Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm

no code implementations 3 Jun 2023 Arya Akhavan, Evgenii Chzhen, Massimiliano Pontil, Alexandre B. Tsybakov

The first algorithm uses a gradient estimator based on randomization over the $\ell_2$ sphere due to Bach and Perchet (2016).

Robust Meta-Representation Learning via Global Label Inference and Classification

1 code implementation 22 Dec 2022 Ruohan Wang, Isak Falk, Massimiliano Pontil, Carlo Ciliberto

Empirically, MeLa outperforms existing methods across a diverse range of benchmarks, in particular under a more challenging setting where the number of training tasks is limited and labels are task-specific.

Few-Shot Learning Representation Learning

Group Meritocratic Fairness in Linear Contextual Bandits

1 code implementation 7 Jun 2022 Riccardo Grazzi, Arya Akhavan, John Isak Texas Falk, Leonardo Cella, Massimiliano Pontil

This is a very strong notion of fairness, since the relative rank is not directly observed by the agent and depends on the underlying reward model and on the distribution of rewards.

Fairness Multi-Armed Bandits

Meta Representation Learning with Contextual Linear Bandits

no code implementations 30 May 2022 Leonardo Cella, Karim Lounici, Massimiliano Pontil

We aim to leverage this information in order to learn a new downstream bandit task, which shares the same representation.

Meta-Learning Representation Learning

A gradient estimator via L1-randomization for online zero-order optimization with two point feedback

no code implementations 27 May 2022 Arya Akhavan, Evgenii Chzhen, Massimiliano Pontil, Alexandre B. Tsybakov

We present a novel gradient estimator based on two function evaluations and randomization on the $\ell_1$-sphere.

Distribution Regression with Sliced Wasserstein Kernels

1 code implementation 8 Feb 2022 Dimitri Meunier, Massimiliano Pontil, Carlo Ciliberto

We study the theoretical properties of a kernel ridge regression estimator based on such representation, for which we prove universal consistency and excess risk bounds.

regression

Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-start

2 code implementations NeurIPS 2023 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

We analyse a general class of bilevel problems, in which the upper-level problem consists in the minimization of a smooth objective function and the lower-level problem is to find the fixed point of a smooth contraction map.

Bilevel Optimization Data Poisoning +2

A Gang of Adversarial Bandits

no code implementations NeurIPS 2021 Mark Herbster, Stephen Pasteris, Fabio Vitale, Massimiliano Pontil

Users are in a social network and the learner is aided by a-priori knowledge of the strengths of the social links between all pairs of users.

Recommendation Systems

Concentration inequalities under sub-Gaussian and sub-exponential conditions

no code implementations NeurIPS 2021 Andreas Maurer, Massimiliano Pontil

We prove analogues of the popular bounded difference inequality (also called McDiarmid's inequality) for functions of independent random variables under sub-gaussian and sub-exponential conditions.

regression

The Role of Global Labels in Few-Shot Classification and How to Infer Them

no code implementations NeurIPS 2021 Ruohan Wang, Massimiliano Pontil, Carlo Ciliberto

Few-shot learning is a central problem in meta-learning, where learners must quickly adapt to new tasks given limited training data.

Few-Shot Learning

Multitask Online Mirror Descent

no code implementations NeurIPS 2021 Nicolò Cesa-Bianchi, Pierre Laforgue, Andrea Paudice, Massimiliano Pontil

We introduce and analyze MT-OMD, a multitask generalization of Online Mirror Descent (OMD) which operates by sharing updates between tasks.

Conditional Meta-Learning of Linear Representations

no code implementations 30 Mar 2021 Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks.

Meta-Learning Representation Learning

Some Hoeffding- and Bernstein-type Concentration Inequalities

no code implementations 11 Feb 2021 Andreas Maurer, Massimiliano Pontil

We prove concentration inequalities for functions of independent random variables under sub-gaussian and sub-exponential conditions.

Vocal Bursts Type Prediction

Distributed Zero-Order Optimization under Adversarial Noise

no code implementations NeurIPS 2021 Arya Akhavan, Massimiliano Pontil, Alexandre B. Tsybakov

We study the problem of distributed zero-order optimization for a class of strongly convex functions.

Optimization and Control Statistics Theory

Robust Unsupervised Learning via L-Statistic Minimization

no code implementations 14 Dec 2020 Andreas Maurer, Daniela A. Parletta, Andrea Paudice, Massimiliano Pontil

Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance.

Clustering

Online Model Selection: a Rested Bandit Formulation

no code implementations 7 Dec 2020 Leonardo Cella, Claudio Gentile, Massimiliano Pontil

Unlike known model selection efforts in the recent bandit literature, our algorithm exploits the specific structure of the problem to learn the unknown parameters of the expected loss function so as to identify the best arm as quickly as possible.

Model Selection

The Advantage of Conditional Meta-Learning for Biased Regularization and Fine Tuning

1 code implementation NeurIPS 2020 Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

However, these methods may perform poorly on heterogeneous environments of tasks, where the complexity of the tasks’ distribution cannot be captured by a single meta-parameter vector.

Meta-Learning

Estimating weighted areas under the ROC curve

no code implementations NeurIPS 2020 Andreas Maurer, Massimiliano Pontil

Exponential bounds on the estimation error are given for the plug-in estimator of weighted areas under the ROC curve.

Convergence Properties of Stochastic Hypergradients

no code implementations 13 Nov 2020 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

Bilevel optimization problems are receiving increasing attention in machine learning as they provide a natural framework for hyperparameter optimization and meta-learning.

Bilevel Optimization Hyperparameter Optimization +1

The Advantage of Conditional Meta-Learning for Biased Regularization and Fine-Tuning

no code implementations 25 Aug 2020 Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

However, these methods may perform poorly on heterogeneous environments of tasks, where the complexity of the tasks' distribution cannot be captured by a single meta-parameter vector.

Meta-Learning

Generalization Properties of Optimal Transport GANs with Latent Distribution Learning

no code implementations 29 Jul 2020 Giulia Luise, Massimiliano Pontil, Carlo Ciliberto

The Generative Adversarial Networks (GAN) framework is a well-established paradigm for probability matching and realistic sample generation.

Online Parameter-Free Learning of Multiple Low Variance Tasks

1 code implementation 11 Jul 2020 Giulia Denevi, Dimitris Stamos, Massimiliano Pontil

We propose a method to learn a common bias vector for a growing sequence of low-variance tasks.

Meta-Learning Multi-Task Learning

On the Iteration Complexity of Hypergradient Computation

1 code implementation 29 Jun 2020 Riccardo Grazzi, Luca Franceschi, Massimiliano Pontil, Saverio Salzo

We study a general class of bilevel problems, consisting in the minimization of an upper-level objective which depends on the solution to a parametric fixed-point equation.

Computational Efficiency Hyperparameter Optimization +1

Multi-source Domain Adaptation via Weighted Joint Distributions Optimal Transport

1 code implementation 23 Jun 2020 Rosanna Turrisi, Rémi Flamary, Alain Rakotomamonjy, Massimiliano Pontil

The problem of domain adaptation on an unlabeled target dataset using knowledge from multiple labelled source datasets is becoming increasingly important.

Diversity Domain Adaptation

Meta-learning with Stochastic Linear Bandits

no code implementations ICML 2020 Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

The goal is to select a learning algorithm which works well on average over a class of bandit tasks that are sampled from a task-distribution.

Meta-Learning

Efficient Tensor Kernel methods for sparse regression

no code implementations 23 Mar 2020 Feliks Hibraj, Marcello Pelillo, Saverio Salzo, Massimiliano Pontil

Second, we use a Nyström-type subsampling approach, which allows for a training phase with a smaller number of data points, so as to reduce the computational cost.

regression

Distance-Based Regularisation of Deep Networks for Fine-Tuning

1 code implementation ICLR 2021 Henry Gouk, Timothy M. Hospedales, Massimiliano Pontil

Our bound is highly relevant for fine-tuning, because providing a network with a good initialisation based on transfer learning means that learning can modify the weights less, and hence achieve tighter generalisation.

Transfer Learning

Online-Within-Online Meta-Learning

1 code implementation NeurIPS 2019 Giulia Denevi, Dimitris Stamos, Carlo Ciliberto, Massimiliano Pontil

We study the problem of learning a series of tasks in a fully online Meta-Learning setting.

Meta-Learning

MARTHE: Scheduling the Learning Rate Via Online Hypergradients

1 code implementation 18 Oct 2019 Michele Donini, Luca Franceschi, Massimiliano Pontil, Orchid Majumder, Paolo Frasconi

We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization.

Hyperparameter Optimization Scheduling

Learning Fair and Transferable Representations

no code implementations NeurIPS 2020 Luca Oneto, Michele Donini, Andreas Maurer, Massimiliano Pontil

Developing learning methods which do not discriminate subgroups in the population is a central goal of algorithmic fairness.

Fairness

Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm

1 code implementation NeurIPS 2019 Giulia Luise, Saverio Salzo, Massimiliano Pontil, Carlo Ciliberto

We present a novel algorithm to estimate the barycenter of arbitrary probability distributions with respect to the Sinkhorn divergence.

Learning Discrete Structures for Graph Neural Networks

2 code implementations 28 Mar 2019 Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He

With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph.

Music Genre Recognition Node Classification

Learning-to-Learn Stochastic Gradient Descent with Biased Regularization

1 code implementation 25 Mar 2019 Giulia Denevi, Carlo Ciliberto, Riccardo Grazzi, Massimiliano Pontil

We study the problem of learning-to-learn: inferring a learning algorithm that works well on tasks sampled from an unknown distribution.

Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction

no code implementations 2 Mar 2019 Giulia Luise, Dimitris Stamos, Massimiliano Pontil, Carlo Ciliberto

We study the interplay between surrogate methods for structured prediction and techniques from multitask learning designed to leverage relationships between surrogate outputs.

Structured Prediction

Uniform concentration and symmetrization for weak interactions

no code implementations 5 Feb 2019 Andreas Maurer, Massimiliano Pontil

The method to derive uniform bounds with Gaussian and Rademacher complexities is extended to the case where the sample average is replaced by a nonlinear statistic.

General Fair Empirical Risk Minimization

no code implementations 29 Jan 2019 Luca Oneto, Michele Donini, Massimiliano Pontil

We tackle the problem of algorithmic fairness, where the goal is to avoid the unfair influence of sensitive information, in the general context of regression with possibly continuous sensitive attributes.

Fairness regression

Learning To Learn Around A Common Mean

no code implementations NeurIPS 2018 Giulia Denevi, Carlo Ciliberto, Dimitris Stamos, Massimiliano Pontil

We show that, in this setting, the LTL problem can be reformulated as a Least Squares (LS) problem and we exploit a novel meta-algorithm to efficiently solve it.

Meta-Learning

Bilevel learning of the Group Lasso structure

no code implementations NeurIPS 2018 Jordan Frecon, Saverio Salzo, Massimiliano Pontil

Regression with group-sparsity penalty plays a central role in high-dimensional prediction problems.

Bilevel Optimization

Taking Advantage of Multitask Learning for Fair Classification

no code implementations 19 Oct 2018 Luca Oneto, Michele Donini, Amon Elders, Massimiliano Pontil

In this paper we show how it is possible to get the best of both worlds: optimize model accuracy and fairness without explicitly using the sensitive feature in the functional form of the model, thereby treating different individuals equally.

Classification Decision Making +2

Far-HO: A Bilevel Programming Package for Hyperparameter Optimization and Meta-Learning

2 code implementations 13 Jun 2018 Luca Franceschi, Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Paolo Frasconi

In (Franceschi et al., 2018) we proposed a unified mathematical framework, grounded on bilevel programming, that encompasses gradient-based hyperparameter optimization and meta-learning.

Hyperparameter Optimization Meta-Learning

Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance

2 code implementations NeurIPS 2018 Giulia Luise, Alessandro Rudi, Massimiliano Pontil, Carlo Ciliberto

Applications of optimal transport have recently gained remarkable attention thanks to the computational advantages of entropic regularization.

Approximating Hamiltonian dynamics with the Nyström method

no code implementations 6 Apr 2018 Alessandro Rudi, Leonard Wossnig, Carlo Ciliberto, Andrea Rocchetto, Massimiliano Pontil, Simone Severini

Simulating the time-evolution of quantum mechanical systems is BQP-hard and expected to be one of the foremost applications of quantum computers.

Incremental Learning-to-Learn with Statistical Guarantees

no code implementations 21 Mar 2018 Giulia Denevi, Carlo Ciliberto, Dimitris Stamos, Massimiliano Pontil

In learning-to-learn the goal is to infer a learning algorithm that works well on a class of tasks sampled from an unknown meta distribution.

Incremental Learning regression

Empirical bounds for functions with weak interactions

no code implementations 11 Mar 2018 Andreas Maurer, Massimiliano Pontil

We provide sharp empirical estimates of expectation, variance and normal approximation for a class of statistics whose variation in any argument does not change too much when another argument is modified.

Empirical Risk Minimization under Fairness Constraints

2 code implementations NeurIPS 2018 Michele Donini, Luca Oneto, Shai Ben-David, John Shawe-Taylor, Massimiliano Pontil

It encourages the conditional risk of the learned classifier to be approximately constant with respect to the sensitive variable.

Fairness

Quantum machine learning: a classical perspective

no code implementations 26 Jul 2017 Carlo Ciliberto, Mark Herbster, Alessandro Davide Ialongo, Massimiliano Pontil, Andrea Rocchetto, Simone Severini, Leonard Wossnig

Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks.

BIG-bench Machine Learning Quantum Machine Learning

Reexamining Low Rank Matrix Factorization for Trace Norm Regularization

no code implementations 27 Jun 2017 Carlo Ciliberto, Dimitris Stamos, Massimiliano Pontil

A standard optimization strategy is based on formulating the problem as one of low rank matrix factorization which, however, leads to a non-convex problem.

Matrix Completion

Consistent Multitask Learning with Nonlinear Output Relations

no code implementations NeurIPS 2017 Carlo Ciliberto, Alessandro Rudi, Lorenzo Rosasco, Massimiliano Pontil

However, in practice assuming the tasks to be linearly related might be restrictive, and allowing for nonlinear structures is a challenge.

Structured Prediction

Forward and Reverse Gradient-Based Hyperparameter Optimization

2 code implementations ICML 2017 Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil

We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent.

Hyperparameter Optimization

Bounds for Vector-Valued Function Estimation

no code implementations 5 Jun 2016 Andreas Maurer, Massimiliano Pontil

Multi-task learning and one-vs-all multi-category learning are treated as examples.

Multi-Task Learning

Unsupervised Cross-Dataset Transfer Learning for Person Re-Identification

no code implementations CVPR 2016 Peixi Peng, Tao Xiang, Yao-Wei Wang, Massimiliano Pontil, Shaogang Gong, Tiejun Huang, Yonghong Tian

Most existing person re-identification (Re-ID) approaches follow a supervised learning framework, in which a large number of labelled matching pairs are required for training.

Dictionary Learning Person Re-Identification +1

Fitting Spectral Decay with the $k$-Support Norm

no code implementations 4 Jan 2016 Andrew M. McDonald, Massimiliano Pontil, Dimitris Stamos

The spectral $k$-support norm enjoys good estimation properties in low rank matrix learning problems, empirically outperforming the trace norm.

Matrix Completion

New Perspectives on $k$-Support and Cluster Norms

no code implementations 27 Dec 2015 Andrew M. McDonald, Massimiliano Pontil, Dimitris Stamos

We note that the spectral box-norm is essentially equivalent to the cluster norm, a multitask learning regularizer introduced by [Jacob et al. 2009a], and which in turn can be interpreted as a perturbation of the spectral $k$-support norm.

Matrix Completion

Learning With Dataset Bias in Latent Subcategory Models

no code implementations CVPR 2015 Dimitris Stamos, Samuele Martelli, Moin Nabi, Andrew McDonald, Vittorio Murino, Massimiliano Pontil

However, previous work has highlighted the possible danger of simply training a model from the combined datasets, due to the presence of bias.

The Benefit of Multitask Representation Learning

no code implementations 23 May 2015 Andreas Maurer, Massimiliano Pontil, Bernardino Romera-Paredes

In particular, focusing on the important example of half-space learning, we derive the regime in which multitask representation learning is beneficial over independent task learning, as a function of the sample size, the number of tasks and the intrinsic data dimensionality.

Representation Learning

New Perspectives on k-Support and Cluster Norms

no code implementations 6 Mar 2014 Andrew M. McDonald, Massimiliano Pontil, Dimitris Stamos

We further extend the $k$-support norm to matrices, and we observe that it is a special case of the matrix cluster norm.

Matrix Completion

Sparse coding for multitask and transfer learning

no code implementations 4 Sep 2012 Andreas Maurer, Massimiliano Pontil, Bernardino Romera-Paredes

We investigate the use of sparse coding and dictionary learning in the context of multitask and transfer learning.

Dictionary Learning Transfer Learning

A Family of Penalty Functions for Structured Sparsity

no code implementations NeurIPS 2010 Jean Morales, Charles A. Micchelli, Massimiliano Pontil

We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern.

regression

Fast Prediction on a Tree

no code implementations NeurIPS 2008 Mark Herbster, Massimiliano Pontil, Sergio R. Galeano

Given an $n$-vertex weighted tree with structural diameter $S$ and a subset of $m$ vertices, we present a technique to compute a corresponding $m \times m$ Gram matrix of the pseudoinverse of the graph Laplacian in $O(n+ m^2 + m S)$ time.

Online Prediction on Large Diameter Graphs

no code implementations NeurIPS 2008 Mark Herbster, Guy Lever, Massimiliano Pontil

Current on-line learning algorithms for predicting the labelling of a graph have an important limitation in the case of large diameter graphs; the number of mistakes made by such algorithms may be proportional to the square root of the number of vertices, even when tackling simple problems.
