Search Results for author: Richard Zemel

Found 64 papers, 39 papers with code

Variational Model Inversion Attacks

1 code implementation NeurIPS 2021 Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani

In this work, we provide a probabilistic interpretation of model inversion attacks, and formulate a variational objective that accounts for both diversity and accuracy.

Identifying and Benchmarking Natural Out-of-Context Prediction Problems

1 code implementation NeurIPS 2021 David Madras, Richard Zemel

Deep learning systems frequently fail at out-of-context (OOC) prediction, the problem of making reliable predictions on uncommon or unusual inputs or subgroups of the training distribution.

Online Unsupervised Learning of Visual Representations and Categories

no code implementations 13 Sep 2021 Mengye Ren, Tyler R. Scott, Michael L. Iuzzolino, Michael C. Mozer, Richard Zemel

Real world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform distribution.

Few-Shot Learning Representation Learning +1

Directly Training Joint Energy-Based Models for Conditional Synthesis and Calibrated Prediction of Multi-Attribute Data

1 code implementation 19 Jul 2021 Jacob Kelly, Richard Zemel, Will Grathwohl

We find our models are capable of both accurate, calibrated predictions and high-quality conditional synthesis of novel attribute combinations.

Classification

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation

1 code implementation 25 Jun 2021 Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao

1) We propose a non-parametric prior distribution over the appearance of image parts so that the latent variable "what-to-draw" per step becomes a categorical random variable.

Image Generation

Learning a Universal Template for Few-shot Dataset Generalization

1 code implementation 14 May 2021 Eleni Triantafillou, Hugo Larochelle, Richard Zemel, Vincent Dumoulin

Few-shot dataset generalization is a challenging variant of the well-studied few-shot classification problem where a diverse training set of several datasets is given, for the purpose of training an adaptable model that can then learn classes from new datasets using only a few examples.

Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes

1 code implementation 22 Apr 2021 James Lucas, Juhan Bae, Michael R. Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

Linear interpolation between initial neural network parameters and converged parameters after training with stochastic gradient descent (SGD) typically leads to a monotonic decrease in the training objective.
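The monotonic linear interpolation (MLI) property described above can be checked directly: evaluate the training objective along the straight line between initial and converged parameters. A minimal sketch on a toy convex quadratic objective (the loss, parameters, and "training" endpoint here are illustrative, not the paper's setup):

```python
def loss(theta):
    # Toy quadratic training objective with minimum at (3.0, -2.0).
    target = (3.0, -2.0)
    return sum((t - c) ** 2 for t, c in zip(theta, target))

theta_init = (0.0, 0.0)    # parameters at initialization
theta_final = (3.0, -2.0)  # parameters after "training"

def interpolate(alpha):
    # theta(alpha) = (1 - alpha) * theta_init + alpha * theta_final
    return tuple((1 - alpha) * i + alpha * f
                 for i, f in zip(theta_init, theta_final))

alphas = [k / 10 for k in range(11)]
losses = [loss(interpolate(a)) for a in alphas]

# On this convex toy objective the interpolated loss decreases monotonically;
# the paper's point is that this also holds empirically for non-convex
# neural network losses under SGD training.
assert all(b <= a for a, b in zip(losses, losses[1:]))
```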

A Computational Framework for Slang Generation

1 code implementation 3 Feb 2021 Zhewei Sun, Richard Zemel, Yang Xu

Slang is a common type of informal language, but its flexible nature and paucity of data resources present challenges for existing natural language systems.

Contrastive Learning

Learning Flexible Classifiers with Shot-CONditional Episodic (SCONE) Training

no code implementations 1 Jan 2021 Eleni Triantafillou, Vincent Dumoulin, Hugo Larochelle, Richard Zemel

We discover that fine-tuning on episodes of a particular shot can specialize the pre-trained model to solving episodes of that shot at the expense of performance on other shots, in agreement with a trade-off recently observed in the context of end-to-end episodic training.

Classification Fine-tuning +1

Exploring representation learning for flexible few-shot tasks

no code implementations 1 Jan 2021 Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel

In this work, we consider a realistic setting where the relationship between examples can change from episode to episode depending on the task context, which is not given to the learner.

Few-Shot Learning Representation Learning

A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks

no code implementations ICLR 2021 Renjie Liao, Raquel Urtasun, Richard Zemel

In this paper, we derive generalization bounds for the two primary classes of graph neural networks (GNNs), namely graph convolutional networks (GCNs) and message passing GNNs (MPGNNs), via a PAC-Bayesian approach.

Generalization Bounds

Few-Shot Attribute Learning

no code implementations 10 Dec 2020 Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel

Compared to standard few-shot learning of semantic classes, in which novel classes may be defined by attributes that were relevant at training time, learning new attributes imposes a stiffer challenge.

Few-Shot Learning Zero-Shot Learning

Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification

1 code implementation 12 Nov 2020 Robert Adragna, Elliot Creager, David Madras, Richard Zemel

Robustness is of central importance in machine learning and has given rise to the fields of domain generalization and invariant learning, which are concerned with improving performance on a test distribution distinct from but related to the training distribution.

Causal Discovery Domain Generalization +2

Environment Inference for Invariant Learning

1 code implementation 14 Oct 2020 Elliot Creager, Jörn-Henrik Jacobsen, Richard Zemel

Learning models that gracefully handle distribution shifts is central to research on domain generalization, robust optimization, and fairness.

Domain Generalization Fairness

Theoretical bounds on estimation error for meta-learning

no code implementations ICLR 2021 James Lucas, Mengye Ren, Irene Kameni, Toniann Pitassi, Richard Zemel

Machine learning models have traditionally been developed under the assumption that the training and test distributions match exactly.

Few-Shot Learning

Exchanging Lessons Between Algorithmic Fairness and Domain Generalization

no code implementations28 Sep 2020 Elliot Creager, Joern-Henrik Jacobsen, Richard Zemel

Developing learning approaches that are not overly sensitive to the training distribution is central to research on domain- or out-of-distribution generalization, robust optimization and fairness.

Domain Generalization Fairness

Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach

no code implementations ICML 2020 Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier

We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.

Fairness Recommendation Systems

Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes

2 code implementations ICLR 2021 Jake Snell, Richard Zemel

Few-shot classification (FSC), the task of adapting a classifier to unseen classes given a small labeled dataset, is an important step on the path toward human-like machine learning.

Classification Gaussian Processes +1

Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data

1 code implementation 18 Jun 2020 Sindy Löwe, David Madras, Richard Zemel, Max Welling

Standard causal discovery methods must fit a new model whenever they encounter samples from a new underlying causal graph.

Causal Discovery Time Series

Shortcut Learning in Deep Neural Networks

2 code implementations 16 Apr 2020 Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix A. Wichmann

Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today's machine intelligence.

Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

1 code implementation ICML 2020 Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel

We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data.
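The estimator above rests on the Stein identity: for a suitable test function $f$, $\mathbb{E}_{x \sim q}[f(x)\,\nabla_x \log q(x) + \nabla \cdot f(x)] = 0$, so the same expectation under $p$ measures the mismatch between $p$ and $q$. A 1-D numeric sketch with a standard normal $q$ and a hand-picked (not learned, unlike the paper's critic) test function:

```python
import math
import random

random.seed(0)

def score_q(x):
    # d/dx log q(x) for a standard normal model density q.
    return -x

def f(x):   # fixed illustrative test function
    return math.sin(x)

def df(x):  # its derivative, needed by the Stein operator
    return math.cos(x)

def stein_estimate(samples):
    # Monte Carlo estimate of E_p[ f(x) * score_q(x) + f'(x) ].
    return sum(f(x) * score_q(x) + df(x) for x in samples) / len(samples)

# When the data actually come from q, the Stein identity drives this to ~0.
on_model = stein_estimate([random.gauss(0.0, 1.0) for _ in range(100_000)])

# When the data come from a shifted distribution, the estimate moves away from 0.
off_model = stein_estimate([random.gauss(1.5, 1.0) for _ in range(100_000)])

assert abs(on_model) < abs(off_model)
```

The paper learns $f$ to maximize this discrepancy; the fixed $\sin$ critic here only illustrates the quantity being estimated.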

SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies

1 code implementation NeurIPS 2019 Seyed Kamyar Seyed Ghasemipour, Shixiang (Shane) Gu, Richard Zemel

We examine the efficacy of our method on a variety of high-dimensional simulated continuous control tasks and observe that SMILe significantly outperforms Meta-BC.

Continuous Control Decision Making +2

A Divergence Minimization Perspective on Imitation Learning Methods

2 code implementations 6 Nov 2019 Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu

We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.

Behavioural cloning Continuous Control +1

Out-of-distribution Detection in Few-shot Classification

no code implementations 25 Sep 2019 Kuan-Chieh Wang, Paul Vicol, Eleni Triantafillou, Chia-Cheng Liu, Richard Zemel

In this work, we propose tasks for out-of-distribution detection in the few-shot setting and establish benchmark datasets, based on four popular few-shot classification datasets.

Classification Out-of-Distribution Detection

Causal Modeling for Fairness in Dynamical Systems

1 code implementation ICML 2020 Elliot Creager, David Madras, Toniann Pitassi, Richard Zemel

In many application areas---lending, education, and online recommenders, for example---fairness and equity concerns emerge when a machine learning system interacts with a dynamically changing environment to produce both immediate and long-term effects for individuals and demographic groups.

Fairness

Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models

1 code implementation 22 Jun 2019 Guangyong Chen, Pengfei Chen, Chang-Yu Hsieh, Chee-Kong Lee, Benben Liao, Renjie Liao, Weiwen Liu, Jiezhong Qiu, Qiming Sun, Jie Tang, Richard Zemel, Shengyu Zhang

We introduce a new molecular dataset, named Alchemy, for developing machine learning models useful in chemistry and material science.

Flexibly Fair Representation Learning by Disentanglement

no code implementations 6 Jun 2019 Elliot Creager, David Madras, Jörn-Henrik Jacobsen, Marissa A. Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel

We consider the problem of learning representations that achieve group and subgroup fairness with respect to multiple sensitive attributes.

Fairness General Classification +1

High-Level Perceptual Similarity is Enabled by Learning Diverse Tasks

no code implementations 26 Mar 2019 Amir Rosenfeld, Richard Zemel, John K. Tsotsos

Predicting human perceptual similarity is a challenging subject of ongoing research.

Learning Latent Subspaces in Variational Autoencoders

1 code implementation NeurIPS 2018 Jack Klys, Jake Snell, Richard Zemel

We consider the problem of unsupervised learning of features correlated to specific labels in a dataset.

Excessive Invariance Causes Adversarial Vulnerability

no code implementations ICLR 2019 Jörn-Henrik Jacobsen, Jens Behrmann, Richard Zemel, Matthias Bethge

Despite their impressive performance, deep neural networks exhibit striking failures on out-of-distribution inputs.

Understanding the Origins of Bias in Word Embeddings

2 code implementations 8 Oct 2018 Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, Richard Zemel

Given a word embedding trained on a corpus, our method identifies how perturbing the corpus will affect the bias of the resulting embedding.

Translation Word Embeddings

Fairness Through Causal Awareness: Learning Latent-Variable Models for Biased Data

no code implementations 7 Sep 2018 David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel

Building on prior work in deep learning and generative modeling, we describe how to learn the parameters of this causal model from observational data alone, even in the presence of unobserved confounders.

Fairness General Classification +1

The Elephant in the Room

1 code implementation 9 Aug 2018 Amir Rosenfeld, Richard Zemel, John K. Tsotsos

We showcase a family of common failures of state-of-the art object detectors.

Object Detection

Distilling the Posterior in Bayesian Neural Networks

no code implementations ICML 2018 Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel

We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN).

Active Learning Anomaly Detection

Adversarial Distillation of Bayesian Neural Network Posteriors

1 code implementation 27 Jun 2018 Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel

We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN).

Active Learning Anomaly Detection

Aggregated Momentum: Stability Through Passive Damping

2 code implementations ICLR 2019 James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse

Momentum is a simple and widely used trick which allows gradient-based optimizers to pick up speed along low curvature directions.
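Aggregated momentum, as the title suggests, maintains several velocity buffers with different damping coefficients and averages them for the parameter update. A sketch of that recurrence on a 1-D quadratic (the damping coefficients, learning rate, and objective are illustrative choices, not the paper's tuned settings):

```python
# Aggregated-momentum sketch: each velocity buffer is damped by its own
# beta, all buffers receive the same gradient, and the update averages them.

def grad(theta):
    # Gradient of the toy objective f(theta) = 0.5 * theta ** 2.
    return theta

betas = [0.0, 0.9]          # one damping coefficient per velocity buffer
lr = 0.1                    # illustrative learning rate
theta = 5.0                 # starting point
velocities = [0.0] * len(betas)

for _ in range(200):
    g = grad(theta)
    # Damp each buffer by its own beta, then apply the shared gradient.
    velocities = [b * v - g for b, v in zip(betas, velocities)]
    # Averaging the buffers passively damps oscillations from the
    # high-momentum buffer while keeping its speed on flat directions.
    theta += lr * sum(velocities) / len(velocities)

assert abs(theta) < 1e-3    # converged near the minimum at 0
```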

Inference in Probabilistic Graphical Models by Graph Neural Networks

1 code implementation 21 Mar 2018 KiJung Yoon, Renjie Liao, Yuwen Xiong, Lisa Zhang, Ethan Fetaya, Raquel Urtasun, Richard Zemel, Xaq Pitkow

Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops.

Decision Making

Reviving and Improving Recurrent Back-Propagation

1 code implementation ICML 2018 Renjie Liao, Yuwen Xiong, Ethan Fetaya, Lisa Zhang, KiJung Yoon, Xaq Pitkow, Raquel Urtasun, Richard Zemel

We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks and hyperparameter optimization for fully connected networks.

Document Classification Hyperparameter Optimization

Neural Relational Inference for Interacting Systems

8 code implementations ICML 2018 Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, Richard Zemel

Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics.

Motion Capture

Predict Responsibly: Increasing Fairness by Learning to Defer

no code implementations ICLR 2018 David Madras, Toniann Pitassi, Richard Zemel

When machine learning models are used for high-stakes decisions, they should predict accurately, fairly, and responsibly.

Decision Making Fairness

Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer

1 code implementation NeurIPS 2018 David Madras, Toniann Pitassi, Richard Zemel

We propose a learning algorithm which accounts for potential biases held by external decision-makers in a system.

Decision Making Fairness

Dualing GANs

no code implementations NeurIPS 2017 Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel

We start from linear discriminators in which case conjugate duality provides a mechanism to reformulate the saddle point objective into a maximization problem, such that both the generator and the discriminator of this 'dualing GAN' act in concert.

Causal Effect Inference with Deep Latent-Variable Models

5 code implementations NeurIPS 2017 Christos Louizos, Uri Shalit, Joris Mooij, David Sontag, Richard Zemel, Max Welling

Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers.

Causal Inference Latent Variable Models

Gated Graph Sequence Neural Networks

8 code implementations 17 Nov 2015 Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel

Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases.

Drug Discovery Graph Classification +2

The Variational Fair Autoencoder

1 code implementation 3 Nov 2015 Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel

We investigate the problem of learning representations that are invariant to certain nuisance or sensitive factors of variation in the data while retaining as much of the remaining information as possible.

General Classification Sentiment Analysis

Siamese neural networks for one-shot image recognition

8 code implementations ICML Deep Learning Workshop 2015 Gregory Koch, Richard Zemel, Ruslan Salakhutdinov

The process of learning good features for machine learning applications can be very computationally expensive and may prove difficult in cases where little data is available.

 Ranked #1 on One-Shot Learning on MNIST (using extra training data)

One-Shot Learning

Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books

3 code implementations ICCV 2015 Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler

Books are a rich source of both fine-grained information (how a character, an object, or a scene looks) and high-level semantics (what someone is thinking or feeling, and how these states evolve through a story).

Sentence Embedding

Generative Moment Matching Networks

3 code implementations 10 Feb 2015 Yujia Li, Kevin Swersky, Richard Zemel

We consider the problem of learning deep generative models from data.

Two-sample testing

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

74 code implementations 10 Feb 2015 Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.
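The core of the attention mechanism described here is simple: the decoder scores each spatial feature vector, softmaxes the scores into weights, and takes a weighted sum as the context for the next word. A minimal sketch, with fixed stand-in scores where the paper uses a learned alignment network over the decoder state:

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of alignment scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Four "annotation vectors", e.g. CNN features from four image regions.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
scores = [2.0, 0.5, 0.1, -1.0]   # illustrative alignment scores

weights = softmax(scores)        # soft attention weights, sum to 1
context = [sum(w * f[d] for w, f in zip(weights, features))
           for d in range(len(features[0]))]

assert abs(sum(weights) - 1.0) < 1e-9
# The highest-scoring region contributes most to the context vector.
assert max(range(len(weights)), key=lambda i: weights[i]) == 0
```

In the soft variant this weighted sum is differentiable end to end; the paper's hard variant instead samples one region and trains with a REINFORCE-style estimator.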

Image Captioning Translation

Learning unbiased features

no code implementations 17 Dec 2014 Yujia Li, Kevin Swersky, Richard Zemel

Different forms of representation learning can be derived from alternative definitions of unwanted bias, e. g., bias to particular tasks, domains, or irrelevant underlying data dimensions.

Domain Adaptation Representation Learning +1

Mean-Field Networks

1 code implementation 21 Oct 2014 Yujia Li, Richard Zemel

The mean field algorithm is a widely used approximate inference algorithm for graphical models whose exact inference is intractable.
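The mean-field updates that this paper unrolls into network layers are coordinate-ascent fixed-point iterations. A sketch for a tiny binary pairwise model, where each marginal is repeatedly set to the sigmoid of its neighbors' expected input (the couplings and biases below are arbitrary illustrative numbers):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Symmetric pairwise couplings with zero diagonal, plus unary biases.
W = [[0.0, 1.0, -0.5],
     [1.0, 0.0, 0.8],
     [-0.5, 0.8, 0.0]]
b = [0.2, -0.1, 0.4]

mu = [0.5, 0.5, 0.5]      # initialize approximate marginals uniformly
for _ in range(50):       # each sweep corresponds to one unrolled "layer"
    for i in range(3):
        # Coordinate update: mu_i <- sigmoid(b_i + sum_j W_ij * mu_j)
        mu[i] = sigmoid(b[i] + sum(W[i][j] * mu[j] for j in range(3)))

# The sweeps converge to a fixed point of the mean-field equations.
for i in range(3):
    fixed = sigmoid(b[i] + sum(W[i][j] * mu[j] for j in range(3)))
    assert abs(mu[i] - fixed) < 1e-6
```

Viewing each sweep as a layer with tied weights is what lets the paper treat the iterations as a feed-forward network and backpropagate through them.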

A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data

no code implementations NeurIPS 2013 Jasper Snoek, Richard Zemel, Ryan P. Adams

Point processes are popular models of neural spiking behavior as they provide a statistical distribution over temporal sequences of spikes and help to reveal the complexities underlying a series of recorded action potentials.

Hippocampus Point Processes

On the Representational Efficiency of Restricted Boltzmann Machines

no code implementations NeurIPS 2013 James Martens, Arkadev Chattopadhya, Toni Pitassi, Richard Zemel

This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)?

Exploring Compositional High Order Pattern Potentials for Structured Output Learning

no code implementations CVPR 2013 Yujia Li, Daniel Tarlow, Richard Zemel

In this work, we study the learning of a general class of pattern-like high order potential, which we call Compositional High Order Pattern Potentials (CHOPPs).
