no code implementations • 31 Aug 2022 • Olivier Bousquet, Steve Hanneke, Shay Moran, Jonathan Shafer, Ilya Tolstikhin
We solve this problem in a principled manner by introducing a combinatorial dimension called VCL that characterizes the best $d'$ for which $d'/n$ is a strong minimax lower bound.
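Stated schematically (the constant $c$ and the exact quantifiers over distributions are assumptions here; the paper's definition of a "strong" lower bound pins these down), the claim is a minimax lower bound of the form

```latex
\inf_{A}\ \sup_{P\ \text{realizable by}\ \mathcal{H}}\
\mathbb{E}_{S \sim P^{n}}\big[\mathrm{err}_{P}(A(S))\big]
\;\geq\; c \cdot \frac{d'}{n},
```

where the infimum ranges over learning algorithms and $d'$ is the VCL dimension of the hypothesis class $\mathcal{H}$.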
no code implementations • 3 Jul 2021 • Ibrahim Alabdulmohsin, Larisa Markeeva, Daniel Keysers, Ilya Tolstikhin
We introduce a generalization to the lottery ticket hypothesis in which the notion of "sparsity" is relaxed by choosing an arbitrary basis in the space of parameters.
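A minimal sketch of what "sparsity in an arbitrary basis" means (the random orthonormal basis and magnitude pruning below are illustrative assumptions, not the paper's specific choices):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=512)                           # flattened layer weights
B, _ = np.linalg.qr(rng.normal(size=(512, 512)))   # an arbitrary orthonormal basis

coeffs = B.T @ w                  # express the weights in the new basis
k = 64                            # keep only the 64 largest coefficients
mask = np.zeros_like(coeffs)
mask[np.argsort(np.abs(coeffs))[-k:]] = 1.0

w_pruned = B @ (coeffs * mask)    # sparse in basis B, dense in the original one
print(np.count_nonzero(mask), "nonzero coefficients in basis B")
```

With B equal to the identity this reduces to ordinary magnitude pruning, which is how the construction generalizes the usual lottery ticket notion of sparsity.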
49 code implementations • NeurIPS 2021 • Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
Convolutional Neural Networks (CNNs) are the go-to model for computer vision.
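The core of the architecture is easy to sketch. Below is a minimal rendering of one Mixer block (dimensions are illustrative defaults, not the paper's exact configurations): an MLP that mixes information across patches, followed by an MLP that mixes across channels, each wrapped in LayerNorm with a skip connection.

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, num_patches=196, dim=512, token_hidden=256, channel_hidden=2048):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(      # mixes across the patch axis
            nn.Linear(num_patches, token_hidden), nn.GELU(),
            nn.Linear(token_hidden, num_patches),
        )
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(    # mixes across the feature axis
            nn.Linear(dim, channel_hidden), nn.GELU(),
            nn.Linear(channel_hidden, dim),
        )

    def forward(self, x):                    # x: (batch, patches, dim)
        y = self.norm1(x).transpose(1, 2)
        x = x + self.token_mlp(y).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))
        return x

x = torch.randn(2, 196, 512)
print(MixerBlock()(x).shape)                 # torch.Size([2, 196, 512])
```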
no code implementations • NeurIPS 2020 • Hartmut Maennel, Ibrahim Alabdulmohsin, Ilya Tolstikhin, Robert J. N. Baldock, Olivier Bousquet, Sylvain Gelly, Daniel Keysers
We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch even after accounting for simple effects, such as weight scaling.
1 code implementation • 26 Feb 2020 • Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya Tolstikhin
Furthermore, the predictors are able to rank networks trained on different, unobserved datasets and with different architectures.
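The recipe is straightforward to sketch (the feature set and regressor below are assumptions for illustration; the paper's choices may differ): summarize each trained network by simple statistics of its weights, then fit an off-the-shelf regressor from those statistics to test accuracy.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def weight_features(layers):
    # Per-layer summary statistics of the (flattened) weights.
    feats = []
    for w in layers:
        feats += [w.mean(), w.std(), np.abs(w).mean(),
                  np.percentile(w, 25), np.percentile(w, 75)]
    return np.array(feats)

# Stand-in data: 200 fake "networks" with 3 layers and placeholder accuracies.
rng = np.random.default_rng(0)
X = np.stack([
    weight_features([rng.normal(scale=s, size=100) for s in (0.1, 0.5, 1.0)])
    for _ in range(200)
])
y = rng.uniform(0.5, 0.95, size=200)

model = GradientBoostingRegressor().fit(X[:150], y[:150])
print(model.predict(X[150:155]))             # predicted accuracies
```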
no code implementations • 28 May 2019 • Christina Göpfert, Shai Ben-David, Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Ruth Urner
In semi-supervised classification, one is given access to both labeled and unlabeled data.
1 code implementation • NeurIPS 2019 • Paul K. Rubenstein, Olivier Bousquet, Josip Djolonga, Carlos Riquelme, Ilya Tolstikhin
The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning.
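For reference, the object being estimated is the standard f-divergence: for a convex $f$ with $f(1) = 0$,

```latex
D_f(P \,\|\, Q) \;=\; \int f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right) \mathrm{d}Q,
```

which recovers the KL divergence for $f(t) = t \log t$ and total variation for $f(t) = \tfrac{1}{2}|t - 1|$.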
5 code implementations • 30 Jan 2019 • Mateo Rojas-Carulla, Ilya Tolstikhin, Guillermo Luque, Nicholas Youngblut, Ruth Ley, Bernhard Schölkopf
We introduce GeNet, a method for shotgun metagenomic classification from raw DNA sequences that exploits the known hierarchical structure between labels for training.
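A minimal sketch of the multi-level idea (the backbone, rank names, and label counts below are assumptions; GeNet's actual network is deeper): a shared encoder over raw reads feeds one classification head per taxonomic rank, and the per-rank cross-entropies are summed so the label hierarchy supervises every level.

```python
import torch
import torch.nn as nn

RANKS = {"phylum": 40, "genus": 800, "species": 1500}   # hypothetical sizes

class HierarchicalClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(        # input: one-hot A/C/G/T channels
            nn.Conv1d(4, 64, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleDict({r: nn.Linear(64, n) for r, n in RANKS.items()})

    def forward(self, x):
        h = self.encoder(x)
        return {r: head(h) for r, head in self.heads.items()}

model = HierarchicalClassifier()
reads = torch.randn(8, 4, 100)               # a batch of one-hot DNA reads
logits = model(reads)
labels = {r: torch.randint(n, (8,)) for r, n in RANKS.items()}
loss = sum(nn.functional.cross_entropy(logits[r], labels[r]) for r in RANKS)
print(loss.item())
```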
no code implementations • 30 Apr 2018 • Francesco Locatello, Damien Vincent, Ilya Tolstikhin, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf
A common assumption in causal modeling posits that the data is generated by a set of independent mechanisms, and algorithms should aim to recover this structure.
no code implementations • 11 Feb 2018 • Paul K. Rubenstein, Bernhard Schoelkopf, Ilya Tolstikhin
We study the role of latent space dimensionality in Wasserstein auto-encoders (WAEs).
14 code implementations • ICLR 2018 • Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf
We propose the Wasserstein Auto-Encoder (WAE), a new algorithm for building a generative model of the data distribution.
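The training objective is simple to sketch. Below is a hedged rendering of the MMD variant of the penalty (deterministic encoder, Gaussian prior, and a single RBF bandwidth are simplifying assumptions; the MMD estimate is the biased V-statistic for brevity): reconstruction cost plus a divergence pushing the aggregate posterior toward the prior $P_Z$.

```python
import torch

def rbf_mmd(x, y, sigma=1.0):
    # Biased (V-statistic) MMD estimate with an RBF kernel.
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def wae_mmd_loss(x, encoder, decoder, lam=10.0):
    z = encoder(x)                           # deterministic encoder
    x_rec = decoder(z)
    z_prior = torch.randn_like(z)            # assumed prior: standard Gaussian
    rec = ((x - x_rec) ** 2).sum(dim=1).mean()
    return rec + lam * rbf_mmd(z, z_prior)
```

Unlike the VAE, the divergence is applied to the aggregate posterior rather than to each $Q(Z \mid x)$ separately, which is what allows the encoder to stay deterministic.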
1 code implementation • ICML 2018 • Matej Balog, Ilya Tolstikhin, Bernhard Schölkopf
First, releasing (an estimate of) the kernel mean embedding of the data generating random variable instead of the database itself still allows third-parties to construct consistent estimators of a wide class of population statistics.
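For context, the released summary is the (empirical) kernel mean embedding, and its usefulness to third parties comes from the standard identity

```latex
\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^{n} k(x_i, \cdot), \qquad
\langle \hat{\mu}_n, f \rangle_{\mathcal{H}} = \frac{1}{n}\sum_{i=1}^{n} f(x_i)
\;\longrightarrow\; \mathbb{E}_{X \sim P}\big[f(X)\big] \quad \text{for } f \in \mathcal{H},
```

so inner products with the embedding yield consistent estimates of a large class of population statistics without access to the database itself.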
no code implementations • 30 Jun 2017 • Paul K. Rubenstein, Ilya Tolstikhin, Philipp Hennig, Bernhard Schoelkopf
We consider the problem of learning the functions computing children from parents in a Structural Causal Model once the underlying causal graph has been identified.
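As a naive baseline this is per-node regression (a sketch under the strong assumption that observational fits identify the mechanisms, which is generally not automatic; the graph and data below are hypothetical):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

graph = {"X": [], "Y": ["X"], "Z": ["X", "Y"]}   # hypothetical known DAG

rng = np.random.default_rng(0)
data = {"X": rng.normal(size=500)}
data["Y"] = 2.0 * data["X"] + rng.normal(size=500)
data["Z"] = data["X"] - data["Y"] + rng.normal(size=500)

mechanisms = {}
for node, parents in graph.items():
    if parents:                                   # fit child = f(parents) + noise
        P = np.column_stack([data[p] for p in parents])
        mechanisms[node] = LinearRegression().fit(P, data[node])
        print(node, "<-", parents, mechanisms[node].coef_)
```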
1 code implementation • 22 May 2017 • Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, Bernhard Schoelkopf
We study unsupervised generative modeling in terms of the optimal transport (OT) problem between true (but unknown) data distribution $P_X$ and the latent variable model distribution $P_G$.
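The key step (stated schematically; the exact conditions on the decoder $G$ are in the paper) is that for deterministic decoders the OT cost admits an auto-encoder-style reformulation:

```latex
W_c(P_X, P_G) \;=\; \inf_{Q(Z \mid X)\,:\,Q_Z = P_Z}\;
\mathbb{E}_{X \sim P_X}\, \mathbb{E}_{Z \sim Q(Z \mid X)}\big[\, c\big(X, G(Z)\big) \big],
```

i.e. the coupling is optimized over encoders whose aggregate posterior $Q_Z$ matches the prior $P_Z$, turning an intractable OT problem into a trainable reconstruction objective with a constraint on the latent distribution.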
1 code implementation • NeurIPS 2017 • Ilya Tolstikhin, Sylvain Gelly, Olivier Bousquet, Carl-Johann Simon-Gabriel, Bernhard Schölkopf
Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) are an effective method for training generative models of complex data such as natural images.
no code implementations • NeurIPS 2016 • Carl-Johann Simon-Gabriel, Adam Ścibior, Ilya Tolstikhin, Bernhard Schölkopf
We provide a theoretical foundation for non-parametric estimation of functions of random variables using kernel mean embeddings.
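Stated schematically (weights and notation assumed here; the precise consistency conditions on $k$ and $f$ are the subject of the paper), the estimator simply pushes the expansion points through the function:

```latex
\hat{\mu}_{X} = \sum_{i=1}^{n} w_i\, k(x_i, \cdot)
\quad\Longrightarrow\quad
\hat{\mu}_{f(X)} = \sum_{i=1}^{n} w_i\, k\big(f(x_i), \cdot\big).
```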
no code implementations • 9 Feb 2016 • Ilya Tolstikhin, David Lopez-Paz
Transductive learning considers a training set of $m$ labeled samples and a test set of $u$ unlabeled samples, with the goal of best labeling that particular test set.
no code implementations • 12 May 2015 • Ilya Tolstikhin, Nikita Zhivotovskiy, Gilles Blanchard
This paper introduces a new complexity measure for transductive learning called Permutational Rademacher Complexity (PRC) and studies its properties.
1 code implementation • 9 Feb 2015 • David Lopez-Paz, Krikamol Muandet, Bernhard Schölkopf, Ilya Tolstikhin
We pose causal inference as the problem of learning to classify probability distributions.
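The recipe is easy to sketch (featurization, synthetic data, and the classifier below are illustrative assumptions, not the paper's exact construction): represent each sample $\{(x_i, y_i)\}$ by an empirical mean embedding via random features, then train an ordinary classifier to predict the causal direction.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 100))                # random Fourier projections
b = rng.uniform(0, 2 * np.pi, size=100)

def featurize(x, y):
    S = np.column_stack([x, y])              # one (x, y) point per row
    return np.cos(S @ W + b).mean(axis=0)    # empirical mean embedding

def toy_pair():
    x = rng.normal(size=300)
    y = np.tanh(x) + 0.1 * rng.normal(size=300)   # ground truth: X -> Y
    return x, y

X, labels = [], []
for _ in range(200):
    x, y = toy_pair()
    flip = rng.random() < 0.5                # label 1 means direction flipped
    pair = (y, x) if flip else (x, y)
    X.append(featurize(*pair))
    labels.append(int(flip))

clf = RandomForestClassifier(random_state=0).fit(X[:150], labels[:150])
print("held-out accuracy:", clf.score(X[150:], labels[150:]))
```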
no code implementations • 26 Nov 2014 • Ilya Tolstikhin, Gilles Blanchard, Marius Kloft
We show two novel concentration inequalities for suprema of empirical processes when sampling without replacement, which both take the variance of the functions into account.