no code implementations • 17 May 2023 • Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta
When prompted with gender disclosures, LLM-generated text contained stigmatizing language and was scored as most toxic when triggered by TGNB gender disclosure.
no code implementations • 13 Apr 2023 • Arjun Mani, Ishaan Preetam Chandratreya, Elliot Creager, Carl Vondrick, Richard Zemel
Modeling the mechanics of fluid in complex scenes is vital to applications in design, graphics, and robotics.
1 code implementation • 27 Dec 2022 • Jake C. Snell, Thomas P. Zollo, Zhun Deng, Toniann Pitassi, Richard Zemel
In this work, we propose a flexible framework to produce a family of bounds on quantiles of the loss distribution incurred by a predictor.
no code implementations • 17 Nov 2022 • Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Rahul Gupta
Natural language often contains ambiguities that can lead to misinterpretation and miscommunication.
no code implementations • 26 May 2022 • Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, Richard Zemel
Recent large-scale natural language processing (NLP) systems use a pre-trained Large Language Model (LLM) on massive and diverse corpora as a head start.
1 code implementation • NAACL 2022 • Zhewei Sun, Richard Zemel, Yang Xu
Slang is a predominant form of informal language that makes flexible and extended use of words and is notoriously hard for natural language processing systems to interpret.
no code implementations • LTEDI (ACL) 2022 • António Câmara, Nina Taneja, Tamjeed Azad, Emily Allaway, Richard Zemel
As natural language processing systems become more widespread, it is necessary to address fairness issues in their implementation and deployment to ensure that their negative impacts on society are understood and minimized.
1 code implementation • 14 Feb 2022 • Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, Richard Zemel, John P. Cunningham
While deep ensembles are a practical way to achieve improvements to predictive power, uncertainty quantification, and robustness, our results show that these improvements can be replicated by a (larger) single model.
1 code implementation • NeurIPS 2021 • Kuan-Chieh Wang, Yan Fu, Ke Li, Ashish Khisti, Richard Zemel, Alireza Makhzani
In this work, we provide a probabilistic interpretation of model inversion attacks, and formulate a variational objective that accounts for both diversity and accuracy.
no code implementations • 29 Dec 2021 • Christina M. Funke, Paul Vicol, Kuan-Chieh Wang, Matthias Kümmerer, Richard Zemel, Matthias Bethge
Exploiting such correlations may increase predictive performance on noisy data; however, often correlations are not robust (e.g., they may change between domains, datasets, or applications) and models that exploit them do not generalize when correlations shift.
1 code implementation • NeurIPS 2021 • David Madras, Richard Zemel
Deep learning systems frequently fail at out-of-context (OOC) prediction, the problem of making reliable predictions on uncommon or unusual inputs or subgroups of the training distribution.
1 code implementation • 13 Sep 2021 • Mengye Ren, Tyler R. Scott, Michael L. Iuzzolino, Michael C. Mozer, Richard Zemel
Real world learning scenarios involve a nonstationary distribution of classes with sequential dependencies among the samples, in contrast to the standard machine learning formulation of drawing samples independently from a fixed, typically uniform distribution.
1 code implementation • 19 Jul 2021 • Jacob Kelly, Richard Zemel, Will Grathwohl
We find our models are capable of both accurate, calibrated predictions and high-quality conditional synthesis of novel attribute combinations.
1 code implementation • 25 Jun 2021 • Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao
1) We propose a non-parametric prior distribution over the appearance of image parts so that the latent variable "what-to-draw" per step becomes a categorical random variable.
1 code implementation • 14 May 2021 • Eleni Triantafillou, Hugo Larochelle, Richard Zemel, Vincent Dumoulin
Few-shot dataset generalization is a challenging variant of the well-studied few-shot classification problem, in which a diverse training set spanning several datasets is given, with the goal of training an adaptable model that can then learn classes from new datasets using only a few examples.
1 code implementation • 22 Apr 2021 • James Lucas, Juhan Bae, Michael R. Zhang, Stanislav Fort, Richard Zemel, Roger Grosse
Linear interpolation between initial neural network parameters and converged parameters after training with stochastic gradient descent (SGD) typically leads to a monotonic decrease in the training objective.
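The phenomenon described here, monotonic linear interpolation, can be illustrated with a toy sketch: evaluate a training objective along the straight line between an initial and a final parameter vector. The quadratic loss and parameter vectors below are made up for illustration, not the paper's networks.

```python
import numpy as np

# Toy illustration of monotonic linear interpolation (MLI): evaluate a
# simple quadratic "training loss" along the line between an initial
# and a converged parameter vector. Loss and parameters are made up.
def loss(w):
    return float(np.sum((w - 3.0) ** 2))  # minimum at w = 3

theta_init = np.zeros(5)        # hypothetical initialization
theta_final = np.full(5, 3.0)   # hypothetical SGD solution

alphas = np.linspace(0.0, 1.0, 11)
losses = [loss((1 - a) * theta_init + a * theta_final) for a in alphas]

# For this convex toy loss the interpolation path decreases monotonically.
assert all(l1 >= l2 for l1, l2 in zip(losses, losses[1:]))
```

The paper's interest is when (and why) this monotone behavior persists or breaks for real, non-convex networks, where monotonicity is not guaranteed.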
1 code implementation • 3 Feb 2021 • Zhewei Sun, Richard Zemel, Yang Xu
Slang is a common type of informal language, but its flexible nature and paucity of data resources present challenges for existing natural language systems.
no code implementations • 1 Jan 2021 • Eleni Triantafillou, Vincent Dumoulin, Hugo Larochelle, Richard Zemel
We discover that fine-tuning on episodes of a particular shot can specialize the pre-trained model to solving episodes of that shot at the expense of performance on other shots, in agreement with a trade-off recently observed in the context of end-to-end episodic training.
no code implementations • 1 Jan 2021 • Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel
In this work, we consider a realistic setting where the relationship between examples can change from episode to episode depending on the task context, which is not given to the learner.
no code implementations • ICLR 2021 • Renjie Liao, Raquel Urtasun, Richard Zemel
In this paper, we derive generalization bounds for the two primary classes of graph neural networks (GNNs), namely graph convolutional networks (GCNs) and message passing GNNs (MPGNNs), via a PAC-Bayesian approach.
no code implementations • 10 Dec 2020 • Mengye Ren, Eleni Triantafillou, Kuan-Chieh Wang, James Lucas, Jake Snell, Xaq Pitkow, Andreas S. Tolias, Richard Zemel
Despite impressive progress in deep learning, generalizing far beyond the training distribution is an important open challenge.
1 code implementation • 12 Nov 2020 • Robert Adragna, Elliot Creager, David Madras, Richard Zemel
Robustness is of central importance in machine learning and has given rise to the fields of domain generalization and invariant learning, which are concerned with improving performance on a test distribution distinct from but related to the training distribution.
1 code implementation • 14 Oct 2020 • Elliot Creager, Jörn-Henrik Jacobsen, Richard Zemel
Learning models that gracefully handle distribution shifts is central to research on domain generalization, robust optimization, and fairness.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
no code implementations • ICLR 2021 • James Lucas, Mengye Ren, Irene Kameni, Toniann Pitassi, Richard Zemel
Machine learning models have traditionally been developed under the assumption that the training and test distributions match exactly.
no code implementations • 28 Sep 2020 • Elliot Creager, Joern-Henrik Jacobsen, Richard Zemel
Developing learning approaches that are not overly sensitive to the training distribution is central to research on domain- or out-of-distribution generalization, robust optimization and fairness.
no code implementations • ICML 2020 • Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier
We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.
2 code implementations • ICLR 2021 • Jake Snell, Richard Zemel
Few-shot classification (FSC), the task of adapting a classifier to unseen classes given a small labeled dataset, is an important step on the path toward human-like machine learning.
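The basic FSC setup can be sketched with a nearest-class-mean classifier: average each class's few support examples into a prototype, then classify queries by nearest prototype. Random features stand in for a learned embedding here; all numbers are illustrative, not the paper's method.

```python
import numpy as np

# Nearest-class-mean sketch of 3-way, 5-shot classification: average
# the support examples of each class into a prototype, then classify a
# query by nearest prototype. Features are random stand-ins.
rng = np.random.default_rng(0)
support = {c: rng.normal(loc=3.0 * c, size=(5, 8)) for c in range(3)}
prototypes = {c: feats.mean(axis=0) for c, feats in support.items()}

query = rng.normal(loc=3.0, size=8)  # drawn near class 1's mean
pred = min(prototypes, key=lambda c: np.linalg.norm(query - prototypes[c]))
assert pred == 1
```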
1 code implementation • 18 Jun 2020 • Sindy Löwe, David Madras, Richard Zemel, Max Welling
This enables us to train a single, amortized model that infers causal relations across samples with different underlying causal graphs, and thus leverages the shared dynamics information.
2 code implementations • 16 Apr 2020 • Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, Felix A. Wichmann
Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today's machine intelligence.
1 code implementation • ICML 2020 • Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel
We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data.
1 code implementation • NeurIPS 2019 • Seyed Kamyar Seyed Ghasemipour, Shixiang (Shane) Gu, Richard Zemel
We examine the efficacy of our method on a variety of high-dimensional simulated continuous control tasks and observe that SMILe significantly outperforms Meta-BC.
2 code implementations • 6 Nov 2019 • Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu
We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.
no code implementations • 25 Sep 2019 • Kuan-Chieh Wang, Paul Vicol, Eleni Triantafillou, Chia-Cheng Liu, Richard Zemel
In this work, we propose tasks for out-of-distribution detection in the few-shot setting and establish benchmark datasets, based on four popular few-shot classification datasets.
1 code implementation • ICML 2020 • Elliot Creager, David Madras, Toniann Pitassi, Richard Zemel
In many application areas (lending, education, and online recommenders, for example), fairness and equity concerns emerge when a machine learning system interacts with a dynamically changing environment to produce both immediate and long-term effects for individuals and demographic groups.
1 code implementation • 22 Jun 2019 • Guangyong Chen, Pengfei Chen, Chang-Yu Hsieh, Chee-Kong Lee, Benben Liao, Renjie Liao, Weiwen Liu, Jiezhong Qiu, Qiming Sun, Jie Tang, Richard Zemel, Shengyu Zhang
We introduce a new molecular dataset, named Alchemy, for developing machine learning models useful in chemistry and material science.
no code implementations • 6 Jun 2019 • Elliot Creager, David Madras, Jörn-Henrik Jacobsen, Marissa A. Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel
We consider the problem of learning representations that achieve group and subgroup fairness with respect to multiple sensitive attributes.
no code implementations • ICLR 2020 • Ethan Fetaya, Jörn-Henrik Jacobsen, Will Grathwohl, Richard Zemel
Class-conditional generative models hold promise to overcome the shortcomings of their discriminative counterparts.
no code implementations • ICLR Workshop DeepGenStruct 2019 • Seyed Kamyar Seyed Ghasemipour, Shane Gu, Richard Zemel
$f$-MAX provides grounds for more directly comparing the objectives for LfD.
no code implementations • 26 Mar 2019 • Amir Rosenfeld, Richard Zemel, John K. Tsotsos
Predicting human perceptual similarity is a challenging subject of ongoing research.
1 code implementation • NeurIPS 2018 • Jack Klys, Jake Snell, Richard Zemel
We consider the problem of unsupervised learning of features correlated to specific labels in a dataset.
no code implementations • ICLR 2019 • Jörn-Henrik Jacobsen, Jens Behrmann, Richard Zemel, Matthias Bethge
Despite their impressive performance, deep neural networks exhibit striking failures on out-of-distribution inputs.
2 code implementations • 8 Oct 2018 • Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, Richard Zemel
Given a word embedding trained on a corpus, our method identifies how perturbing the corpus will affect the bias of the resulting embedding.
1 code implementation • NeurIPS 2018 • Lisa Zhang, Gregory Rosenblatt, Ethan Fetaya, Renjie Liao, William E. Byrd, Matthew Might, Raquel Urtasun, Richard Zemel
Synthesizing programs using example input/outputs is a classic problem in artificial intelligence.
no code implementations • 7 Sep 2018 • David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel
Building on prior work in deep learning and generative modeling, we describe how to learn the parameters of this causal model from observational data alone, even in the presence of unobserved confounders.
1 code implementation • 9 Aug 2018 • Amir Rosenfeld, Richard Zemel, John K. Tsotsos
We showcase a family of common failures of state-of-the-art object detectors.
no code implementations • ICML 2018 • Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel
We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN).
1 code implementation • 27 Jun 2018 • Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel
We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN).
1 code implementation • ICLR 2019 • James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse
Momentum is a simple and widely used trick which allows gradient-based optimizers to pick up speed along low curvature directions.
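The heavy-ball momentum update this sentence refers to can be sketched on a one-dimensional quadratic; the step size and momentum coefficient below are illustrative choices, not values from the paper.

```python
def grad(w):
    return 2.0 * (w - 5.0)  # gradient of the quadratic (w - 5)^2

w, v = 0.0, 0.0
lr, beta = 0.1, 0.9  # illustrative step size and momentum coefficient
for _ in range(400):
    v = beta * v + grad(w)  # velocity accumulates past gradients
    w = w - lr * v          # heavy-ball parameter update

assert abs(w - 5.0) < 1e-3  # converges to the minimizer
```

Along low-curvature directions the velocity term lets successive gradients reinforce each other, which is the speed-up the sentence describes.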
1 code implementation • 21 Mar 2018 • KiJung Yoon, Renjie Liao, Yuwen Xiong, Lisa Zhang, Ethan Fetaya, Raquel Urtasun, Richard Zemel, Xaq Pitkow
Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops.
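As a reference point for what belief propagation computes, here is a minimal sum-product pass on a tiny loop-free chain, where BP is exact; the pairwise potentials are made-up numbers. The paper's interest is the loopy case, where this exactness breaks down.

```python
import numpy as np

# Sum-product belief propagation on a 3-variable binary chain x1-x2-x3.
# On a tree BP is exact, so the belief at x2 must match brute-force
# enumeration. Both pairwise potentials are made-up numbers.
psi12 = np.array([[1.0, 0.5], [0.5, 2.0]])  # potential on (x1, x2)
psi23 = np.array([[1.5, 1.0], [0.2, 1.0]])  # potential on (x2, x3)

# Messages into x2 from each leaf (uniform unary potentials assumed).
m1_to_2 = psi12.sum(axis=0)   # marginalize out x1
m3_to_2 = psi23.sum(axis=1)   # marginalize out x3
belief2 = m1_to_2 * m3_to_2
belief2 /= belief2.sum()

# Brute force: marginalize the unnormalized joint over x1 and x3.
brute = np.einsum('ij,jk->j', psi12, psi23)
brute /= brute.sum()
assert np.allclose(belief2, brute)
```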
1 code implementation • ICML 2018 • Renjie Liao, Yuwen Xiong, Ethan Fetaya, Lisa Zhang, KiJung Yoon, Xaq Pitkow, Raquel Urtasun, Richard Zemel
We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks, and hyperparameter optimization for fully connected networks.
1 code implementation • ICLR 2018 • Renjie Liao, Marc Brockschmidt, Daniel Tarlow, Alexander L. Gaunt, Raquel Urtasun, Richard Zemel
We present graph partition neural networks (GPNN), an extension of graph neural networks (GNNs) able to handle extremely large graphs.
5 code implementations • ICML 2018 • David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel
In this paper, we advocate for representation learning as the key to mitigating unfair prediction outcomes downstream.
9 code implementations • ICML 2018 • Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, Richard Zemel
Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics.
no code implementations • ICLR 2018 • David Madras, Toniann Pitassi, Richard Zemel
When machine learning models are used for high-stakes decisions, they should predict accurately, fairly, and responsibly.
1 code implementation • NeurIPS 2018 • David Madras, Toniann Pitassi, Richard Zemel
We propose a learning algorithm which accounts for potential biases held by external decision-makers in a system.
no code implementations • NeurIPS 2017 • Eleni Triantafillou, Richard Zemel, Raquel Urtasun
Few-shot learning refers to understanding new concepts from only a few examples.
no code implementations • NeurIPS 2017 • Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel
We start from linear discriminators in which case conjugate duality provides a mechanism to reformulate the saddle point objective into a maximization problem, such that both the generator and the discriminator of this 'dualing GAN' act in concert.
5 code implementations • NeurIPS 2017 • Christos Louizos, Uri Shalit, Joris Mooij, David Sontag, Richard Zemel, Max Welling
Learning individual-level causal effects from observational data, such as inferring the most effective medication for a specific patient, is a problem of growing importance for policy makers.
Ranked #9 on Causal Inference on IHDP
2 code implementations • NeurIPS 2016 • Wenjie Luo, Yujia Li, Raquel Urtasun, Richard Zemel
We study characteristics of receptive fields of units in deep convolutional networks.
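For contrast with the paper's empirical finding (effective receptive fields are much smaller than theoretical ones and roughly Gaussian), the theoretical receptive field of a conv stack follows a standard recurrence; the layer configuration below is a hypothetical small CNN.

```python
# Theoretical receptive field of a stack of conv layers, given as
# (kernel_size, stride) pairs; this small CNN is hypothetical.
layers = [(3, 1), (3, 2), (3, 1), (3, 2)]

rf, jump = 1, 1
for k, s in layers:
    rf += (k - 1) * jump  # each layer widens the field by (k-1) * jump
    jump *= s             # stride compounds the spacing between outputs

print(rf)  # 13 input pixels
```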
1 code implementation • NeurIPS 2016 • Renjie Liao, Alex Schwing, Richard Zemel, Raquel Urtasun
In this paper we aim at facilitating generalization for deep networks while supporting interpretability of the learned representations.
13 code implementations • 17 Nov 2015 • Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel
Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases.
Ranked #1 on Graph Classification on IPC-grounded
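One round of the neighborhood aggregation underlying graph neural networks of this kind can be sketched as follows; this is a plain, non-gated update with made-up weights, whereas the GGNN itself uses a GRU-style gated update.

```python
import numpy as np

# One round of message passing on a toy 3-node path graph: each node
# sums a linear transform of its neighbors' states, then applies a
# nonlinearity. Weights and state sizes are made up.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])    # adjacency of the path graph 1-2-3
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))     # shared message transform
h = rng.normal(size=(3, 4))     # per-node hidden states

messages = A @ (h @ W.T)        # sum of transformed neighbor states
h_new = np.tanh(messages)       # simple (non-gated) state update
assert h_new.shape == (3, 4)
```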
2 code implementations • 3 Nov 2015 • Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel
We investigate the problem of learning representations that are invariant to certain nuisance or sensitive factors of variation in the data while retaining as much of the remaining information as possible.
Ranked #4 on Sentiment Analysis on Multi-Domain Sentiment Dataset
9 code implementations • ICML Deep Learning Workshop 2015 • Gregory Koch, Richard Zemel, Ruslan Salakhutdinov
The process of learning good features for machine learning applications can be very computationally expensive and may prove difficult in cases where little data is available.
Ranked #1 on One-Shot Learning on MNIST (using extra training data)
3 code implementations • ICCV 2015 • Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler
Books are a rich source of both fine-grained information, such as what a character, an object, or a scene looks like, and high-level semantics, such as what someone is thinking or feeling and how these states evolve through a story.
3 code implementations • NeurIPS 2015 • Mengye Ren, Ryan Kiros, Richard Zemel
A suite of baseline results on this new dataset are also presented.
Ranked #4 on Video Question Answering on SUTD-TrafficQA
3 code implementations • 10 Feb 2015 • Yujia Li, Kevin Swersky, Richard Zemel
We consider the problem of learning deep generative models from data.
85 code implementations • 10 Feb 2015 • Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio
Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.
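The core soft-attention step in such models can be sketched as: score each spatial feature against the decoder state, softmax into weights, and form a context vector as the weighted sum. Dot-product scoring and all shapes below are simplifying assumptions, not the paper's exact MLP-based alignment network.

```python
import numpy as np

# Soft attention over a grid of image features: score each location
# against the decoder state, softmax into weights, and form a context
# vector as the weighted sum. Shapes and scoring are illustrative.
rng = np.random.default_rng(1)
features = rng.normal(size=(196, 512))  # 14x14 grid of annotation vectors
state = rng.normal(size=512)            # current decoder hidden state

scores = features @ state               # one alignment score per location
weights = np.exp(scores - scores.max())
weights /= weights.sum()                # softmax attention weights
context = weights @ features            # (512,) context vector for the decoder
assert np.isclose(weights.sum(), 1.0)
```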
no code implementations • 17 Dec 2014 • Yujia Li, Kevin Swersky, Richard Zemel
Different forms of representation learning can be derived from alternative definitions of unwanted bias, e.g., bias to particular tasks, domains, or irrelevant underlying data dimensions.
1 code implementation • 21 Oct 2014 • Yujia Li, Richard Zemel
The mean field algorithm is a widely used approximate inference algorithm for graphical models whose exact inference is intractable.
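For a pairwise binary MRF, the naive mean-field fixed-point updates the sentence refers to look like this; the biases and couplings are made-up numbers for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Naive mean field q(x) = prod_i q_i(x_i) for a small pairwise binary
# MRF. Unary biases b and symmetric couplings W are made-up numbers.
b = np.array([0.2, -0.1, 0.4])
W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, -0.3],
              [0.5, -0.3, 0.0]])

mu = np.full(3, 0.5)  # mu_i = q_i(x_i = 1), initialized uniformly
for _ in range(50):   # coordinate-ascent sweeps toward the fixed point
    for i in range(3):
        mu[i] = sigmoid(b[i] + W[i] @ mu)  # standard mean-field update

# At convergence mu satisfies the mean-field fixed-point equations.
assert np.allclose(mu, sigmoid(b + W @ mu), atol=1e-6)
```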
no code implementations • NeurIPS 2013 • Jasper Snoek, Richard Zemel, Ryan P. Adams
Point processes are popular models of neural spiking behavior as they provide a statistical distribution over temporal sequences of spikes and help to reveal the complexities underlying a series of recorded action potentials.
no code implementations • NeurIPS 2013 • James Martens, Arkadev Chattopadhya, Toni Pitassi, Richard Zemel
This paper examines the question: What kinds of distributions can be efficiently represented by Restricted Boltzmann Machines (RBMs)?
no code implementations • CVPR 2013 • Yujia Li, Daniel Tarlow, Richard Zemel
In this work, we study the learning of a general class of pattern-like high order potential, which we call Compositional High Order Pattern Potentials (CHOPPs).