no code implementations • ICML 2020 • Riccardo Grazzi, Saverio Salzo, Massimiliano Pontil, Luca Franceschi
We study a general class of bilevel optimization problems, in which the upper-level objective is defined via the solution of a fixed point equation.
no code implementations • 30 May 2025 • Fabio Fehr, Prabhu Teja Sivaprasad, Luca Franceschi, Giovanni Zappella
In this paper, we introduce CoRet, a dense retrieval model designed for code-editing tasks that integrates code semantics, repository structure, and call graph dependencies.
no code implementations • 30 Oct 2024 • Luca Franceschi, Michele Donini, Valerio Perrone, Aaron Klein, Cédric Archambeau, Matthias Seeger, Massimiliano Pontil, Paolo Frasconi
Hyperparameters are configuration variables controlling the behavior of machine learning algorithms.
no code implementations • 8 Oct 2024 • Yihong Chen, Xiangxiang Xu, Yao Lu, Pontus Stenetorp, Luca Franceschi
We introduce a framework for expanding residual computational graphs using jets, operators that generalize truncated Taylor series.
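For background, the classical order-k jet of a smooth map collects the coefficients of its truncated Taylor series; a schematic statement of that standard definition (the paper's operators generalize this notion to residual computational graphs) is:

```latex
% Background only: the classical order-k jet of a smooth map f at a point x
% collects the coefficients of its truncated Taylor expansion; the paper's
% jet operators generalize this idea to residual computational graphs.
f(x + h) \;\approx\; \sum_{i=0}^{k} \frac{f^{(i)}(x)}{i!}\, h^{i}
```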
1 code implementation • 15 Jul 2024 • Pola Schwöbel, Luca Franceschi, Muhammad Bilal Zafar, Keerthan Vasist, Aman Malhotra, Tomer Shenhar, Pinal Tailor, Pinar Yilmaz, Michael Diamond, Michele Donini
A case study demonstrates a typical use case for the library: picking a suitable model for a question answering task.
1 code implementation • 15 Feb 2024 • Luca Franceschi, Michele Donini, Cédric Archambeau, Matthias Seeger
We argue that often there is a critical mismatch between what one wishes to explain (e.g. the output of a classifier) and what current methods such as SHAP explain (e.g. the scalar probability of a class).
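As a minimal, hypothetical illustration of this mismatch (the model and numbers below are invented for this sketch, not taken from the paper): a feature can move the class probability substantially while leaving the classifier's actual output, the predicted label, unchanged.

```python
import numpy as np

# Hypothetical binary classifier: p(y=1 | x) depends on both features, but on
# this input the *decision* (the thresholded label) is insensitive to x[1]:
# flipping x[1] moves the probability a lot while the predicted class stays 1.
def prob(x):
    return 1.0 / (1.0 + np.exp(-(2.0 * x[0] + 1.0 * x[1])))

def decision(x):
    return int(prob(x) > 0.5)

x = np.array([2.0, 1.0])
x_flipped = np.array([2.0, -1.0])  # perturb the second feature

print(prob(x), prob(x_flipped))          # ~0.993 vs ~0.953: large change
print(decision(x), decision(x_flipped))  # 1 vs 1: decision unchanged
```

An attribution method applied to the probability would credit x[1] here, while an explanation of the decision itself would not.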
1 code implementation • 27 Jan 2023 • Valentina Zantedeschi, Luca Franceschi, Jean Kaddour, Matt J. Kusner, Vlad Niculae
We propose a continuous optimization framework for discovering a latent directed acyclic graph (DAG) from observational data.
no code implementations • 27 Oct 2022 • Andrew J. Wren, Pasquale Minervini, Luca Franceschi, Valentina Zantedeschi
Recently, continuous relaxations have been proposed to learn Directed Acyclic Graphs (DAGs) from data by backpropagation, instead of using combinatorial optimization.
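For background, one widely used continuous relaxation of acyclicity is the NOTEARS penalty of Zheng et al. (2018); a minimal sketch of that penalty (background for the entry above, not this paper's own method):

```python
import numpy as np
from scipy.linalg import expm

def notears_acyclicity(W: np.ndarray) -> float:
    """NOTEARS-style acyclicity penalty: h(W) = tr(exp(W * W)) - d.

    h(W) == 0 exactly when the weighted adjacency matrix W encodes a DAG,
    so acyclicity can be enforced by driving h to zero with gradient-based
    (continuous) optimization.
    """
    d = W.shape[0]
    return np.trace(expm(W * W)) - d  # elementwise square keeps entries >= 0

# A 3-node cycle has h > 0; removing one edge yields a DAG with h == 0.
cyclic = np.array([[0., 1., 0.], [0., 0., 1.], [1., 0., 0.]])
acyclic = np.array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
print(notears_acyclicity(cyclic) > 1e-6, abs(notears_acyclicity(acyclic)) < 1e-9)
```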
1 code implementation • 11 Sep 2022 • Pasquale Minervini, Luca Franceschi, Mathias Niepert
In this work, we present Adaptive IMLE (AIMLE), the first adaptive gradient estimator for complex discrete distributions: it adaptively identifies the target distribution for I-MLE by trading off the density of gradient information with the degree of bias in the gradient estimates.
1 code implementation • 20 Jul 2022 • Yihong Chen, Pushkar Mishra, Luca Franceschi, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel
Factorisation-based Models (FMs), such as DistMult, have enjoyed enduring success for Knowledge Graph Completion (KGC) tasks, often outperforming Graph Neural Networks (GNNs).
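For reference, DistMult scores a triple with a trilinear dot product over the head, relation, and tail embeddings; a minimal sketch (embedding size and values are illustrative):

```python
import numpy as np

def distmult_score(e_head, w_rel, e_tail):
    """DistMult score for (h, r, t): sum_i e_h[i] * w_r[i] * e_t[i].

    The relation acts as a diagonal bilinear map, which makes the score
    symmetric in head and tail -- one reason DistMult is simple yet strong
    on many KGC benchmarks.
    """
    return float(np.sum(e_head * w_rel * e_tail))

rng = np.random.default_rng(0)
dim = 8                                   # illustrative embedding dimension
e_h, w_r, e_t = rng.normal(size=(3, dim))
print(distmult_score(e_h, w_r, e_t))
```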
3 code implementations • NeurIPS 2021 • Mathias Niepert, Pasquale Minervini, Luca Franceschi
We propose Implicit Maximum Likelihood Estimation (I-MLE), a framework for end-to-end learning of models combining discrete exponential family distributions and differentiable neural components.
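A hedged sketch of the core I-MLE gradient estimate for the simplest case, a categorical (one-hot argmax) distribution, following the paper's perturb-and-MAP recipe; the strength `lam` of the target distribution is the knob that the follow-up AIMLE work (above) adapts automatically:

```python
import numpy as np

def map_state(theta):
    z = np.zeros_like(theta)
    z[np.argmax(theta)] = 1.0  # MAP of a categorical = one-hot argmax
    return z

def imle_grad(theta, downstream_grad, lam=10.0, rng=None):
    """I-MLE-style estimate of dL/dtheta (sketch; lam is illustrative)."""
    rng = rng or np.random.default_rng()
    eps = rng.gumbel(size=theta.shape)        # shared perturbation noise
    z = map_state(theta + eps)                # perturb-and-MAP sample
    theta_target = theta - lam * downstream_grad  # target distribution
    z_target = map_state(theta_target + eps)  # same noise, shifted logits
    return (z - z_target) / lam               # finite-difference estimate

theta = np.array([1.0, 0.5, -0.2])
dL_dz = np.array([0.0, -1.0, 0.0])  # downstream loss prefers component 1
print(imle_grad(theta, dL_dz, rng=np.random.default_rng(0)))
```

Descending this estimate shifts the logits toward states favoured by the downstream loss, while the forward pass keeps using exact discrete samples.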
1 code implementation • 29 Jun 2020 • Riccardo Grazzi, Luca Franceschi, Massimiliano Pontil, Saverio Salzo
We study a general class of bilevel problems, consisting of the minimization of an upper-level objective that depends on the solution to a parametric fixed-point equation.
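Schematically, the problem class and its hypergradient, obtained by implicitly differentiating the fixed-point equation (notation here is illustrative):

```latex
% Upper-level objective E evaluated at the solution of a parametric
% fixed-point equation w = Phi(w, lambda); the hypergradient follows
% from the implicit function theorem.
\min_{\lambda} \; f(\lambda) := E\big(w(\lambda), \lambda\big)
\quad \text{s.t.} \quad w(\lambda) = \Phi\big(w(\lambda), \lambda\big),
\qquad
\nabla f(\lambda) = \nabla_{\lambda} E
  + \partial_{\lambda}\Phi^{\top}\big(I - \partial_{w}\Phi\big)^{-\top} \nabla_{w} E .
```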
1 code implementation • 18 Oct 2019 • Michele Donini, Luca Franceschi, Massimiliano Pontil, Orchid Majumder, Paolo Frasconi
We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization.
no code implementations • 25 Sep 2019 • Michele Donini, Luca Franceschi, Orchid Majumder, Massimiliano Pontil, Paolo Frasconi
We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization.
2 code implementations • 28 Mar 2019 • Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He
With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph.
Ranked #3 on Node Classification on Cora (fixed 20 nodes per class)
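Schematically, the bilevel program described above, with θ parameterizing independent Bernoulli variables over candidate edges (notation illustrative; F and L stand for the outer and inner losses):

```latex
% Outer level: learn edge-distribution parameters theta by minimizing the
% expected outer loss F over sampled adjacency matrices A; inner level:
% train the GCN weights w under the same distribution.
\min_{\theta} \;
  \mathbb{E}_{A \sim \mathrm{Ber}(\theta)}\big[ F\big(w(\theta), A\big) \big]
\quad \text{s.t.} \quad
w(\theta) \in \arg\min_{w} \;
  \mathbb{E}_{A \sim \mathrm{Ber}(\theta)}\big[ L(w, A) \big]
```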
2 code implementations • 13 Jun 2018 • Luca Franceschi, Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Paolo Frasconi
In (Franceschi et al., 2018) we proposed a unified mathematical framework, grounded on bilevel programming, that encompasses gradient-based hyperparameter optimization and meta-learning.
no code implementations • ICML 2018 • Luca Franceschi, Paolo Frasconi, Saverio Salzo, Riccardo Grazzi, Massimiliano Pontil
We introduce a framework based on bilevel programming that unifies gradient-based hyperparameter optimization and meta-learning.
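In this framework the inner problem is approximated by T steps of an optimization dynamics; schematically (notation illustrative):

```latex
% Bilevel program with the inner argmin replaced by T steps of a dynamics
% Phi_t (e.g. gradient descent on the training loss). In hyperparameter
% optimization, lambda collects the hyperparameters; in meta-learning, it
% parameterizes the representation shared across tasks.
\min_{\lambda} \; E\big(w_T(\lambda), \lambda\big)
\quad \text{s.t.} \quad
w_t(\lambda) = \Phi_t\big(w_{t-1}(\lambda), \lambda\big),
\qquad t = 1, \dots, T
```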
1 code implementation • 18 Dec 2017 • Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil
We consider a class of nested optimization problems involving inner and outer objectives.
3 code implementations • ICML 2017 • Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil
We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent.
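A hedged 1-D sketch of the two procedures for a toy instance (all constants are illustrative): training loss L(w) = 0.5·a·(w−b)², optimized by T gradient-descent steps with learning rate eta as the hyperparameter, and validation error E(w) = 0.5·(w−c)².

```python
# Toy problem: w_t = Phi(w_{t-1}, eta) = w_{t-1} - eta*a*(w_{t-1} - b).
a, b, c, eta, T, w0 = 2.0, 1.0, 3.0, 0.1, 50, 0.0

def unroll(eta):
    ws = [w0]
    for _ in range(T):
        ws.append(ws[-1] - eta * a * (ws[-1] - b))  # one SGD step
    return ws

ws = unroll(eta)
A = 1.0 - eta * a                     # dPhi/dw, constant for this quadratic
Bs = [-a * (w - b) for w in ws[:-1]]  # dPhi/deta at each step

# Forward mode: propagate z_t = dw_t/deta alongside the training run.
z = 0.0
for B in Bs:
    z = A * z + B
grad_fwd = (ws[-1] - c) * z           # chain rule with dE/dw_T = w_T - c

# Reverse mode: backpropagate alpha = dE/dw_t through the stored trajectory.
alpha, grad_rev = ws[-1] - c, 0.0
for B in reversed(Bs):
    grad_rev += alpha * B
    alpha *= A

# Finite-difference check of the hypergradient dE/deta.
E = lambda e: 0.5 * (unroll(e)[-1] - c) ** 2
grad_fd = (E(eta + 1e-6) - E(eta - 1e-6)) / 2e-6
print(grad_fwd, grad_rev, grad_fd)    # all three agree closely
```

Both modes compute the same quantity; they differ in cost profile (forward mode scales with the number of hyperparameters, reverse mode stores the trajectory), which is the trade-off the paper studies.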