Search Results for author: Danica J. Sutherland

Found 22 papers, 11 papers with code

One Weird Trick to Improve Your Semi-Weakly Supervised Semantic Segmentation Model

no code implementations 2 May 2022 Wonho Bae, Junhyug Noh, Milad Jalali Asadabadi, Danica J. Sutherland

Semi-weakly supervised semantic segmentation (SWSSS) aims to train a model to identify objects in images based on a small number of images with pixel-level labels, and many more images with only image-level labels.

Weakly-Supervised Semantic Segmentation

Better Supervisory Signals by Observing Learning Paths

1 code implementation ICLR 2022 Yi Ren, Shangmin Guo, Danica J. Sutherland

Observing the learning path not only provides a new perspective for understanding knowledge distillation, overfitting, and learning dynamics, but also reveals that the supervisory signal of a teacher network can be very unstable near the best points in training on real tasks.

Knowledge Distillation
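
For context on the distillation setting this paper analyzes, here is a minimal sketch of a standard soft-target knowledge-distillation loss. This is generic background, not the paper's own method; the temperature T and weight alpha are illustrative choices.

```python
# Minimal sketch of a standard knowledge-distillation loss (soft targets with a
# temperature), as background for the setting the paper studies. Not the
# paper's method; T and alpha are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Cross-entropy against the hard labels.
    hard = F.cross_entropy(student_logits, labels)
    # KL divergence between temperature-softened teacher and student outputs.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * hard + (1 - alpha) * soft
```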

Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression

no code implementations 8 Dec 2021 Lijia Zhou, Frederic Koehler, Danica J. Sutherland, Nathan Srebro

We study a localized notion of uniform convergence known as an "optimistic rate" (Panchenko 2002; Srebro et al. 2010) for linear regression with Gaussian data.

Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds, and Benign Overfitting

no code implementations NeurIPS 2021 Frederic Koehler, Lijia Zhou, Danica J. Sutherland, Nathan Srebro

We consider interpolation learning in high-dimensional linear regression with Gaussian data, and prove a generic uniform convergence guarantee on the generalization error of interpolators in an arbitrary hypothesis class in terms of the class's Gaussian width.

Generalization Bounds
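
For reference, the Gaussian width appearing in this bound is the standard quantity below; the paper states its guarantee in terms of this width of the hypothesis class.

```latex
% Gaussian width of a set K \subseteq \mathbb{R}^d (standard definition).
w(K) \;=\; \mathbb{E}_{g \sim \mathcal{N}(0, I_d)} \Big[ \sup_{v \in K} \langle g, v \rangle \Big]
```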

Self-Supervised Learning with Kernel Dependence Maximization

1 code implementation NeurIPS 2021 Yazhe Li, Roman Pogodin, Danica J. Sutherland, Arthur Gretton

We approach self-supervised learning of image representations from a statistical dependence perspective, proposing Self-Supervised Learning with the Hilbert-Schmidt Independence Criterion (SSL-HSIC).

Depth Estimation, Object Recognition (+2 more)
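
As background on the dependence measure named here, a minimal numpy sketch of the standard biased empirical HSIC between two batches of representations. The Gaussian kernels and bandwidth are illustrative; this is not the SSL-HSIC training objective itself.

```python
# Standard (biased) empirical HSIC between two batches of vectors, as background
# for the dependence criterion named in the paper. Kernel and bandwidth are
# illustrative; this is not the SSL-HSIC loss.
import numpy as np

def gaussian_gram(X, bandwidth=1.0):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * bandwidth ** 2))

def hsic_biased(X, Y, bandwidth=1.0):
    n = X.shape[0]
    K = gaussian_gram(X, bandwidth)
    L = gaussian_gram(Y, bandwidth)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```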

Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data

1 code implementation NeurIPS 2021 Feng Liu, Wenkai Xu, Jie Lu, Danica J. Sutherland

In realistic scenarios with very limited numbers of data samples, however, it can be challenging to identify a kernel powerful enough to distinguish complex distributions.

Two-sample testing

Does Invariant Risk Minimization Capture Invariance?

no code implementations 4 Jan 2021 Pritish Kamath, Akilesh Tangella, Danica J. Sutherland, Nathan Srebro

We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. (2019) can fail to capture "natural" invariances, at least when used in its practical "linear" form, and even on very simple problems which directly follow the motivating examples for IRM.
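For context, the practical "linear" form referred to here is the IRMv1 penalty of Arjovsky et al. (2019): the squared gradient of each environment's risk with respect to a frozen scalar classifier placed on top of the featurizer. The sketch below illustrates that penalty; the loss function and data handling are illustrative, not the paper's experimental setup.

```python
# Sketch of the IRMv1 penalty ("linear" practical form of Arjovsky et al. 2019)
# that the abstract refers to: squared gradient of each environment's risk with
# respect to a frozen scalar classifier w = 1.0 on top of the featurizer.
# Loss choice and data handling are illustrative.
import torch
import torch.nn.functional as F

def irmv1_penalty(logits, labels):
    w = torch.tensor(1.0, requires_grad=True)
    risk = F.binary_cross_entropy_with_logits(logits * w, labels)
    grad, = torch.autograd.grad(risk, [w], create_graph=True)
    return grad.pow(2)

def irm_objective(envs, featurizer, lam=1.0):
    # envs: iterable of (x, y) batches, one per training environment.
    total = 0.0
    for x, y in envs:
        logits = featurizer(x).squeeze(-1)
        total = total + F.binary_cross_entropy_with_logits(logits, y)
        total = total + lam * irmv1_penalty(logits, y)
    return total
```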

On Uniform Convergence and Low-Norm Interpolation Learning

no code implementations NeurIPS 2020 Lijia Zhou, Danica J. Sutherland, Nathan Srebro

We argue, however, that the consistency of the minimal-norm interpolator can be explained with a slightly weaker, yet standard, notion: uniform convergence of zero-error predictors in a norm ball.
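
Spelled out, the notion invoked here controls the population risk of every interpolator of bounded norm; the radius B and the population/empirical risks L and L-hat below are generic symbols, not the paper's exact constants.

```latex
% Uniform convergence of zero-error (interpolating) predictors in a norm ball:
% the population risk of every bounded-norm interpolator is driven to zero.
\sup_{\substack{\|h\| \le B \\ \widehat{L}(h) = 0}} L(h) \;\longrightarrow\; 0
```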

Learning Deep Kernels for Non-Parametric Two-Sample Tests

1 code implementation ICML 2020 Feng Liu, Wenkai Xu, Jie Lu, Guangquan Zhang, Arthur Gretton, Danica J. Sutherland

We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.

Two-sample testing
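
As a concrete reference point for this testing setup, a minimal permutation-based MMD two-sample test with a plain Gaussian kernel. The paper's learned deep kernel would replace the fixed kernel here; the bandwidth and permutation count are illustrative.

```python
# Minimal permutation-based MMD two-sample test with a plain Gaussian kernel.
# The paper's learned deep kernel would replace gaussian_gram; bandwidth and
# permutation count are illustrative.
import numpy as np

def gaussian_gram(A, B, bandwidth=1.0):
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * bandwidth ** 2))

def mmd2_biased(X, Y, bandwidth=1.0):
    return (gaussian_gram(X, X, bandwidth).mean()
            + gaussian_gram(Y, Y, bandwidth).mean()
            - 2 * gaussian_gram(X, Y, bandwidth).mean())

def permutation_test(X, Y, n_perms=200, bandwidth=1.0, seed=0):
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(X, Y, bandwidth)
    Z = np.concatenate([X, Y])
    n = X.shape[0]
    count = 0
    for _ in range(n_perms):
        perm = rng.permutation(Z.shape[0])
        if mmd2_biased(Z[perm[:n]], Z[perm[n:]], bandwidth) >= observed:
            count += 1
    return (count + 1) / (n_perms + 1)   # permutation p-value
```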

Unbiased estimators for the variance of MMD estimators

no code implementations 5 Jun 2019 Danica J. Sutherland

The maximum mean discrepancy (MMD) is a kernel-based distance between probability distributions useful in many applications (Gretton et al. 2012), bearing a simple estimator with pleasing computational and statistical properties.

Two-sample testing
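
The "simple estimator" mentioned here is typically the unbiased U-statistic estimate of the squared MMD (Gretton et al. 2012); a numpy sketch, with an illustrative Gaussian kernel and bandwidth, which drops the diagonal terms of the within-sample Gram matrices to remove the bias.

```python
# Unbiased U-statistic estimator of the squared MMD (Gretton et al. 2012).
# Kernel choice and bandwidth are illustrative; within-sample diagonals are
# excluded to make the estimate unbiased.
import numpy as np

def mmd2_unbiased(X, Y, bandwidth=1.0):
    def gram(A, B):
        sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2 * bandwidth ** 2))
    m, n = X.shape[0], Y.shape[0]
    Kxx, Kyy, Kxy = gram(X, X), gram(Y, Y), gram(X, Y)
    term_xx = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2 * Kxy.mean()
```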

Learning deep kernels for exponential family densities

1 code implementation 20 Nov 2018 Li Wenliang, Danica J. Sutherland, Heiko Strathmann, Arthur Gretton

The kernel exponential family is a rich class of distributions, which can be fit efficiently and with statistical guarantees by score matching.
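
For context, the model class and fitting principle mentioned here take roughly the following form, with q_0 a fixed base density and f an RKHS function; the exact regularized objective used in the paper is not reproduced.

```latex
% Kernel exponential family: unnormalized log-density is an RKHS function f
% added to a fixed base log-density \log q_0.
p_f(x) \;\propto\; q_0(x)\, \exp\!\big(f(x)\big), \qquad f \in \mathcal{H}_k
% Hyvarinen score matching fits f without the normalizing constant:
J(f) \;=\; \mathbb{E}_{x \sim p_{\mathrm{data}}} \sum_{i=1}^{d}
  \Big[ \partial_i^2 \log \tilde p_f(x) \;+\; \tfrac{1}{2}\big(\partial_i \log \tilde p_f(x)\big)^2 \Big],
\qquad \tilde p_f(x) = q_0(x)\, e^{f(x)}
```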

On gradient regularizers for MMD GANs

1 code implementation NeurIPS 2018 Michael Arbel, Danica J. Sutherland, Mikołaj Bińkowski, Arthur Gretton

We propose a principled method for gradient-based regularization of the critic of GAN-like models trained by adversarially optimizing the kernel of a Maximum Mean Discrepancy (MMD).

Image Generation

Demystifying MMD GANs

4 code implementations ICLR 2018 Mikołaj Bińkowski, Danica J. Sutherland, Michael Arbel, Arthur Gretton

We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as critic, termed MMD GANs.

Efficient and principled score estimation with Nyström kernel exponential families

1 code implementation 23 May 2017 Danica J. Sutherland, Heiko Strathmann, Michael Arbel, Arthur Gretton

We propose a fast method with statistical guarantees for learning an exponential family density model where the natural parameter is in a reproducing kernel Hilbert space, and may be infinite-dimensional.

Denoising, Density Estimation

Bayesian Approaches to Distribution Regression

1 code implementation 11 May 2017 Ho Chung Leon Law, Danica J. Sutherland, Dino Sejdinovic, Seth Flaxman

Distribution regression has recently attracted much interest as a generic solution to the problem of supervised learning where labels are available at the group level, rather than at the individual level.

Frame
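
To make the setting concrete, here is a non-Bayesian baseline sketch of distribution regression: similarity between two labelled groups is the mean pairwise kernel between their samples (equivalently, a kernel on empirical mean embeddings), and kernel ridge regression maps groups to group-level labels. The paper's Bayesian treatment builds on this kind of pipeline; the kernel, bandwidth, and regularizer are illustrative.

```python
# Baseline (non-Bayesian) distribution regression: mean-map kernel between bags
# of samples plus kernel ridge regression on group-level labels. Kernel,
# bandwidth, and regularizer are illustrative.
import numpy as np

def mean_map_kernel(bag_a, bag_b, bandwidth=1.0):
    sq = np.sum((bag_a[:, None, :] - bag_b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * bandwidth ** 2)).mean()

def fit_krr_on_bags(bags, labels, bandwidth=1.0, lam=1e-2):
    n = len(bags)
    K = np.array([[mean_map_kernel(bags[i], bags[j], bandwidth)
                   for j in range(n)] for i in range(n)])
    alpha = np.linalg.solve(K + lam * np.eye(n), labels)
    # Predict a new bag t as sum_i alpha[i] * mean_map_kernel(bags[i], t).
    return alpha
```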

Fixing an error in Caponnetto and de Vito (2007)

no code implementations 9 Feb 2017 Danica J. Sutherland

The seminal paper of Caponnetto and de Vito (2007) provides minimax-optimal rates for kernel ridge regression in a very general setting.

Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy

1 code implementation 14 Nov 2016 Danica J. Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, Arthur Gretton

In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples.

Understanding the 2016 US Presidential Election using ecological inference and distribution regression with census microdata

1 code implementation 11 Nov 2016 Seth Flaxman, Danica J. Sutherland, Yu-Xiang Wang, Yee Whye Teh

We combine fine-grained spatially referenced census data with the vote outcomes from the 2016 US presidential election.

Deep Mean Maps

no code implementations 13 Nov 2015 Junier B. Oliva, Danica J. Sutherland, Barnabás Póczos, Jeff Schneider

The use of distributions and high-level features from deep architectures has become commonplace in modern computer vision.

Linear-time Learning on Distributions with Approximate Kernel Embeddings

no code implementations 24 Sep 2015 Danica J. Sutherland, Junier B. Oliva, Barnabás Póczos, Jeff Schneider

This work develops the first random features for pdfs whose dot product approximates kernels using these non-Euclidean metrics, allowing estimators using such kernels to scale to large datasets by working in a primal space, without computing large Gram matrices.
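
The familiar Euclidean analogue of this idea is random Fourier features, whose dot products approximate a Gaussian kernel so that learning can happen in the primal feature space without an n-by-n Gram matrix. The sketch below shows that analogue only; the paper's contribution is features of this kind for non-Euclidean, information-theoretic metrics on densities, which this sketch does not implement.

```python
# Random Fourier features approximating a Gaussian kernel: Z @ Z.T approximates
# the kernel Gram matrix, so estimators can work in the primal space. This is
# the Euclidean analogue of the paper's features, not the features themselves.
import numpy as np

def random_fourier_features(X, D=512, bandwidth=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, D))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```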

Kernels on Sample Sets via Nonparametric Divergence Estimates

no code implementations 1 Feb 2012 Danica J. Sutherland, Liang Xiong, Barnabás Póczos, Jeff Schneider

Most machine learning algorithms, such as classification or regression, treat the individual data point as the object of interest.

Anomaly Detection, General Classification
