Search Results for author: Richard Zemel

Found 73 papers, 44 papers with code

SURFSUP: Learning Fluid Simulation for Novel Surfaces

no code implementations 13 Apr 2023 Arjun Mani, Ishaan Preetam Chandratreya, Elliot Creager, Carl Vondrick, Richard Zemel

Modeling the mechanics of fluid in complex scenes is vital to applications in design, graphics, and robotics.

Quantile Risk Control: A Flexible Framework for Bounding the Probability of High-Loss Predictions

1 code implementation 27 Dec 2022 Jake C. Snell, Thomas P. Zollo, Zhun Deng, Toniann Pitassi, Richard Zemel

In this work, we propose a flexible framework to produce a family of bounds on quantiles of the loss distribution incurred by a predictor.

Differentially Private Decoding in Large Language Models

no code implementations 26 May 2022 Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, Richard Zemel

Recent large-scale natural language processing (NLP) systems use a Large Language Model (LLM) pre-trained on massive and diverse corpora as a head start.

Language Modelling Privacy Preserving

Semantically Informed Slang Interpretation

1 code implementation NAACL 2022 Zhewei Sun, Richard Zemel, Yang Xu

Slang is a predominant form of informal language making flexible and extended use of words that is notoriously hard for natural language processing systems to interpret.

Machine Translation Translation

Mapping the Multilingual Margins: Intersectional Biases of Sentiment Analysis Systems in English, Spanish, and Arabic

no code implementations LTEDI (ACL) 2022 António Câmara, Nina Taneja, Tamjeed Azad, Emily Allaway, Richard Zemel

As natural language processing systems become more widespread, it is necessary to address fairness issues in their implementation and deployment to ensure that their negative impacts on society are understood and minimized.

Fairness regression +1

Deep Ensembles Work, But Are They Necessary?

1 code implementation 14 Feb 2022 Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, Richard Zemel, John P. Cunningham

While deep ensembles are a practical way to achieve improvements to predictive power, uncertainty quantification, and robustness, our results show that these improvements can be replicated by a (larger) single model.

Variational Model Inversion Attacks

1 code implementation NeurIPS 2021 Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani

In this work, we provide a probabilistic interpretation of model inversion attacks, and formulate a variational objective that accounts for both diversity and accuracy.

Disentanglement and Generalization Under Correlation Shifts

no code implementations 29 Dec 2021 Christina M. Funke, Paul Vicol, Kuan-Chieh Wang, Matthias Kümmerer, Richard Zemel, Matthias Bethge

Exploiting such correlations may increase predictive performance on noisy data; however, often correlations are not robust (e.g., they may change between domains, datasets, or applications) and models that exploit them do not generalize when correlations shift.


Identifying and Benchmarking Natural Out-of-Context Prediction Problems

1 code implementation NeurIPS 2021 David Madras, Richard Zemel

Deep learning systems frequently fail at out-of-context (OOC) prediction, the problem of making reliable predictions on uncommon or unusual inputs or subgroups of the training distribution.


Online Unsupervised Learning of Visual Representations and Categories

1 code implementation 13 Sep 2021 Mengye Ren, Tyler R. Scott, Michael L. Iuzzolino, Michael C. Mozer, Richard Zemel

Real world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform distribution.

Few-Shot Learning Representation Learning +1

Directly Training Joint Energy-Based Models for Conditional Synthesis and Calibrated Prediction of Multi-Attribute Data

1 code implementation 19 Jul 2021 Jacob Kelly, Richard Zemel, Will Grathwohl

We find our models are capable of both accurate, calibrated predictions and high-quality conditional synthesis of novel attribute combinations.

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation

1 code implementation 25 Jun 2021 Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao

We propose a non-parametric prior distribution over the appearance of image parts so that the latent variable "what-to-draw" per step becomes a categorical random variable.

Image Generation

Learning a Universal Template for Few-shot Dataset Generalization

1 code implementation 14 May 2021 Eleni Triantafillou, Hugo Larochelle, Richard Zemel, Vincent Dumoulin

Few-shot dataset generalization is a challenging variant of the well-studied few-shot classification problem where a diverse training set of several datasets is given, for the purpose of training an adaptable model that can then learn classes from new datasets using only a few examples.

Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes

1 code implementation 22 Apr 2021 James Lucas, Juhan Bae, Michael R. Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

Linear interpolation between initial neural network parameters and converged parameters after training with stochastic gradient descent (SGD) typically leads to a monotonic decrease in the training objective.

A Computational Framework for Slang Generation

1 code implementation 3 Feb 2021 Zhewei Sun, Richard Zemel, Yang Xu

Slang is a common type of informal language, but its flexible nature and paucity of data resources present challenges for existing natural language systems.

Contrastive Learning

Learning Flexible Classifiers with Shot-CONditional Episodic (SCONE) Training

no code implementations 1 Jan 2021 Eleni Triantafillou, Vincent Dumoulin, Hugo Larochelle, Richard Zemel

We discover that fine-tuning on episodes of a particular shot can specialize the pre-trained model to solving episodes of that shot at the expense of performance on other shots, in agreement with a trade-off recently observed in the context of end-to-end episodic training.

Classification General Classification

Exploring representation learning for flexible few-shot tasks

no code implementations 1 Jan 2021 Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel

In this work, we consider a realistic setting where the relationship between examples can change from episode to episode depending on the task context, which is not given to the learner.

Few-Shot Learning Representation Learning

A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks

no code implementations ICLR 2021 Renjie Liao, Raquel Urtasun, Richard Zemel

In this paper, we derive generalization bounds for the two primary classes of graph neural networks (GNNs), namely graph convolutional networks (GCNs) and message passing GNNs (MPGNNs), via a PAC-Bayesian approach.

Generalization Bounds

Probing Few-Shot Generalization with Attributes

no code implementations 10 Dec 2020 Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel

Despite impressive progress in deep learning, generalizing far beyond the training distribution is an important open challenge.

Few-Shot Learning Zero-Shot Learning

Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification

1 code implementation 12 Nov 2020 Robert Adragna, Elliot Creager, David Madras, Richard Zemel

Robustness is of central importance in machine learning and has given rise to the fields of domain generalization and invariant learning, which are concerned with improving performance on a test distribution distinct from but related to the training distribution.

BIG-bench Machine Learning Causal Discovery +3

Environment Inference for Invariant Learning

1 code implementation 14 Oct 2020 Elliot Creager, Jörn-Henrik Jacobsen, Richard Zemel

Learning models that gracefully handle distribution shifts is central to research on domain generalization, robust optimization, and fairness.

Domain Generalization Fairness +1

Theoretical bounds on estimation error for meta-learning

no code implementations ICLR 2021 James Lucas, Mengye Ren, Irene Kameni, Toniann Pitassi, Richard Zemel

Machine learning models have traditionally been developed under the assumption that the training and test distributions match exactly.

Few-Shot Learning

Exchanging Lessons Between Algorithmic Fairness and Domain Generalization

no code implementations 28 Sep 2020 Elliot Creager, Joern-Henrik Jacobsen, Richard Zemel

Developing learning approaches that are not overly sensitive to the training distribution is central to research on domain- or out-of-distribution generalization, robust optimization and fairness.

Domain Generalization Fairness +1

Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach

no code implementations ICML 2020 Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier

We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.

Fairness Recommendation Systems

Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes

2 code implementations ICLR 2021 Jake Snell, Richard Zemel

Few-shot classification (FSC), the task of adapting a classifier to unseen classes given a small labeled dataset, is an important step on the path toward human-like machine learning.

Classification Gaussian Processes +1

Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data

1 code implementation 18 Jun 2020 Sindy Löwe, David Madras, Richard Zemel, Max Welling

This enables us to train a single, amortized model that infers causal relations across samples with different underlying causal graphs, and thus leverages the shared dynamics information.

Causal Discovery Time Series Analysis

Shortcut Learning in Deep Neural Networks

2 code implementations 16 Apr 2020 Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix A. Wichmann

Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today's machine intelligence.


Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

1 code implementation ICML 2020 Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel

We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data.

SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies

1 code implementation NeurIPS 2019 Seyed Kamyar Seyed Ghasemipour, Shixiang (Shane) Gu, Richard Zemel

We examine the efficacy of our method on a variety of high-dimensional simulated continuous control tasks and observe that SMILe significantly outperforms Meta-BC.

Continuous Control Few-Shot Learning +3

A Divergence Minimization Perspective on Imitation Learning Methods

2 code implementations 6 Nov 2019 Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu

We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.

Behavioural cloning Continuous Control

Out-of-distribution Detection in Few-shot Classification

no code implementations 25 Sep 2019 Kuan-Chieh Wang, Paul Vicol, Eleni Triantafillou, Chia-Cheng Liu, Richard Zemel

In this work, we propose tasks for out-of-distribution detection in the few-shot setting and establish benchmark datasets, based on four popular few-shot classification datasets.

Classification Out-of-Distribution Detection

Causal Modeling for Fairness in Dynamical Systems

1 code implementation ICML 2020 Elliot Creager, David Madras, Toniann Pitassi, Richard Zemel

In many application areas---lending, education, and online recommenders, for example---fairness and equity concerns emerge when a machine learning system interacts with a dynamically changing environment to produce both immediate and long-term effects for individuals and demographic groups.


Flexibly Fair Representation Learning by Disentanglement

no code implementations 6 Jun 2019 Elliot Creager, David Madras, Jörn-Henrik Jacobsen, Marissa A. Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel

We consider the problem of learning representations that achieve group and subgroup fairness with respect to multiple sensitive attributes.

Disentanglement Fairness +1

Learning Latent Subspaces in Variational Autoencoders

1 code implementation NeurIPS 2018 Jack Klys, Jake Snell, Richard Zemel

We consider the problem of unsupervised learning of features correlated to specific labels in a dataset.

Excessive Invariance Causes Adversarial Vulnerability

no code implementations ICLR 2019 Jörn-Henrik Jacobsen, Jens Behrmann, Richard Zemel, Matthias Bethge

Despite their impressive performance, deep neural networks exhibit striking failures on out-of-distribution inputs.

Understanding the Origins of Bias in Word Embeddings

2 code implementations 8 Oct 2018 Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, Richard Zemel

Given a word embedding trained on a corpus, our method identifies how perturbing the corpus will affect the bias of the resulting embedding.

BIG-bench Machine Learning Translation +1

Fairness Through Causal Awareness: Learning Latent-Variable Models for Biased Data

no code implementations 7 Sep 2018 David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel

Building on prior work in deep learning and generative modeling, we describe how to learn the parameters of this causal model from observational data alone, even in the presence of unobserved confounders.

Fairness General Classification

The Elephant in the Room

1 code implementation 9 Aug 2018 Amir Rosenfeld, Richard Zemel, John K. Tsotsos

We showcase a family of common failures of state-of-the-art object detectors.

Object Detection

Distilling the Posterior in Bayesian Neural Networks

no code implementations ICML 2018 Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel

We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN).

Active Learning Anomaly Detection

Adversarial Distillation of Bayesian Neural Network Posteriors

1 code implementation 27 Jun 2018 Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel

We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN).

Active Learning Anomaly Detection

Aggregated Momentum: Stability Through Passive Damping

1 code implementation ICLR 2019 James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse

Momentum is a simple and widely used trick which allows gradient-based optimizers to pick up speed along low curvature directions.

Inference in Probabilistic Graphical Models by Graph Neural Networks

1 code implementation 21 Mar 2018 KiJung Yoon, Renjie Liao, Yuwen Xiong, Lisa Zhang, Ethan Fetaya, Raquel Urtasun, Richard Zemel, Xaq Pitkow

Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops.

Decision Making

Reviving and Improving Recurrent Back-Propagation

1 code implementation ICML 2018 Renjie Liao, Yuwen Xiong, Ethan Fetaya, Lisa Zhang, KiJung Yoon, Xaq Pitkow, Raquel Urtasun, Richard Zemel

We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks and hyperparameter optimization for fully connected networks.

Document Classification Hyperparameter Optimization

Neural Relational Inference for Interacting Systems

9 code implementations ICML 2018 Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, Richard Zemel

Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics.

Predict Responsibly: Increasing Fairness by Learning to Defer

no code implementations ICLR 2018 David Madras, Toniann Pitassi, Richard Zemel

When machine learning models are used for high-stakes decisions, they should predict accurately, fairly, and responsibly.

Decision Making Fairness

Predict Responsibly: Improving Fairness and Accuracy by Learning to Defer

1 code implementation NeurIPS 2018 David Madras, Toniann Pitassi, Richard Zemel

We propose a learning algorithm which accounts for potential biases held by external decision-makers in a system.

Decision Making Fairness

Dualing GANs

no code implementations NeurIPS 2017 Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel

We start from linear discriminators in which case conjugate duality provides a mechanism to reformulate the saddle point objective into a maximization problem, such that both the generator and the discriminator of this 'dualing GAN' act in concert.

Causal Effect Inference with Deep Latent-Variable Models

5 code implementations NeurIPS 2017 Christos Louizos, Uri Shalit, Joris Mooij, David Sontag, Richard Zemel, Max Welling

Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers.

Causal Inference

Gated Graph Sequence Neural Networks

13 code implementations 17 Nov 2015 Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel

Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases.

Drug Discovery Graph Classification +2

The Variational Fair Autoencoder

2 code implementations 3 Nov 2015 Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel

We investigate the problem of learning representations that are invariant to certain nuisance or sensitive factors of variation in the data while retaining as much of the remaining information as possible.

General Classification Sentiment Analysis

Siamese neural networks for one-shot image recognition

9 code implementations ICML Deep Learning Workshop 2015 Gregory Koch, Richard Zemel, Ruslan Salakhutdinov

The process of learning good features for machine learning applications can be very computationally expensive and may prove difficult in cases where little data is available.

 Ranked #1 on One-Shot Learning on MNIST (using extra training data)

One-Shot Learning

Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books

3 code implementations ICCV 2015 Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler

Books are a rich source of both fine-grained information (what a character, an object, or a scene looks like) and high-level semantics (what someone is thinking and feeling, and how these states evolve through a story).

Sentence Embedding

Generative Moment Matching Networks

3 code implementations 10 Feb 2015 Yujia Li, Kevin Swersky, Richard Zemel

We consider the problem of learning deep generative models from data.

Two-sample testing

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

85 code implementations 10 Feb 2015 Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio

Inspired by recent work in machine translation and object detection, we introduce an attention-based model that automatically learns to describe the content of images.

Image Captioning Translation

Learning unbiased features

no code implementations 17 Dec 2014 Yujia Li, Kevin Swersky, Richard Zemel

Different forms of representation learning can be derived from alternative definitions of unwanted bias, e.g., bias to particular tasks, domains, or irrelevant underlying data dimensions.

Domain Adaptation Representation Learning +1

Mean-Field Networks

1 code implementation 21 Oct 2014 Yujia Li, Richard Zemel

The mean field algorithm is a widely used approximate inference algorithm for graphical models whose exact inference is intractable.

A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data

no code implementations NeurIPS 2013 Jasper Snoek, Richard Zemel, Ryan P. Adams

Point processes are popular models of neural spiking behavior as they provide a statistical distribution over temporal sequences of spikes and help to reveal the complexities underlying a series of recorded action potentials.

Hippocampus Point Processes +1

On the Representational Efficiency of Restricted Boltzmann Machines

no code implementations NeurIPS 2013 James Martens, Arkadev Chattopadhya, Toni Pitassi, Richard Zemel

This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)?

Exploring Compositional High Order Pattern Potentials for Structured Output Learning

no code implementations CVPR 2013 Yujia Li, Daniel Tarlow, Richard Zemel

In this work, we study the learning of a general class of pattern-like high order potential, which we call Compositional High Order Pattern Potentials (CHOPPs).

Vocal Bursts Intensity Prediction
