
1 code implementation • ICLR 2022 • Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine

An alternative paradigm is a "data-driven", offline approach that utilizes logged simulation data to architect hardware accelerators, without needing any further simulation.

2 code implementations • 16 Sep 2021 • Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani

Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge about the characteristics of those functions to deploy BO successfully.

1 code implementation • 12 Feb 2021 • Yujun Yan, Milad Hashemi, Kevin Swersky, Yaoqing Yang, Danai Koutra

In our theoretical analysis, we show that the common causes of the heterophily and oversmoothing problems, namely the relative degree of a node and its heterophily level, trigger the node representations in consecutive layers to "move" closer to the original decision boundary, which increases the misclassification rate of node labels under certain constraints.

Ranked #6 on Node Classification on Chameleon

1 code implementation • 8 Feb 2021 • Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables.
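The core idea (Gibbs-With-Gradients) is to use the gradient of the unnormalized log-density to propose which discrete variable to flip. A minimal sketch for a toy binary model with log p(x) = x·Wx + b·x follows; the model, step size, and parameter values here are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gwg_step(x, W, b, rng):
    """One gradient-informed Metropolis-Hastings step over binary x."""
    def logp(x):
        return x @ W @ x + b @ x

    def flip_scores(x):
        grad = (W + W.T) @ x + b          # gradient of log p at the current point
        return (1.0 - 2.0 * x) * grad     # first-order estimate of delta-log-p per flip

    q = np.exp(flip_scores(x) / 2.0)
    q = q / q.sum()                       # proposal distribution over which bit to flip
    i = rng.choice(len(x), p=q)
    x_new = x.copy()
    x_new[i] = 1.0 - x_new[i]

    q_rev = np.exp(flip_scores(x_new) / 2.0)
    q_rev = q_rev / q_rev.sum()
    accept = min(1.0, np.exp(logp(x_new) - logp(x)) * q_rev[i] / q[i])
    return x_new if rng.random() < accept else x
```

Because the proposal concentrates on flips that increase log p, the chain mixes far faster than uniform-flip Metropolis on high-dimensional discrete models.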

no code implementations • 2 Feb 2021 • Amir Yazdanbakhsh, Christof Angermueller, Berkin Akin, Yanqi Zhou, Albin Jones, Milad Hashemi, Kevin Swersky, Satrajit Chatterjee, Ravi Narayanaswami, James Laudon

We further show that by transferring knowledge between target architectures with different design constraints, Apollo is able to find optimal configurations faster and often with better objective value (up to 25% improvements).

no code implementations • 18 Dec 2020 • Francis Williams, Or Litany, Avneesh Sud, Kevin Swersky, Andrea Tagliasacchi

We introduce a technique for 3D human keypoint estimation that directly models the notion of spatial uncertainty of a keypoint.

1 code implementation • ICLR 2021 • Will Grathwohl, Jacob Kelly, Milad Hashemi, Mohammad Norouzi, Kevin Swersky, David Duvenaud

Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.

no code implementations • 5 Oct 2020 • Zhan Shi, Chirag Sakhuja, Milad Hashemi, Kevin Swersky, Calvin Lin

The use of deep learning has grown at an exponential rate, giving rise to numerous specialized hardware and software systems for deep learning.

no code implementations • ICML 2020 • Martin Mladenov, Elliot Creager, Omer Ben-Porat, Kevin Swersky, Richard Zemel, Craig Boutilier

We develop several scalable techniques to solve the matching problem, and also draw connections to various notions of user regret and fairness, arguing that these outcomes are fairer in a utilitarian sense.

1 code implementation • ICML 2020 • Evan Zheran Liu, Milad Hashemi, Kevin Swersky, Parthasarathy Ranganathan, Junwhan Ahn

While directly applying Belady's algorithm is infeasible since the future is unknown, we train a policy conditioned only on past accesses that accurately approximates Belady's even on diverse and complex access patterns; we call this approach Parrot.
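For reference, Belady's algorithm evicts the cached line whose next use lies farthest in the future. A minimal (and deliberately naive, O(n²)) sketch, assuming a fully associative cache and a fully known trace:

```python
def belady_hits(trace, cache_size):
    """Count cache hits under Belady's optimal replacement policy."""
    cache = set()
    hits = 0
    for i, addr in enumerate(trace):
        if addr in cache:
            hits += 1
            continue
        if len(cache) >= cache_size:
            def next_use(a):
                # Index of the next access to a, or infinity if never reused.
                for j in range(i + 1, len(trace)):
                    if trace[j] == a:
                        return j
                return float("inf")
            # Evict the line reused farthest in the future (or never).
            victim = max(cache, key=next_use)
            cache.remove(victim)
        cache.add(addr)
    return hits
```

It is exactly this oracle, unavailable at run time because it peeks at the future, that the learned policy is trained to imitate from past accesses alone.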

8 code implementations • NeurIPS 2020 • Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton

The proposed semi-supervised learning algorithm can be summarized in three steps: unsupervised pretraining of a big ResNet model using SimCLRv2, supervised fine-tuning on a few labeled examples, and distillation with unlabeled examples for refining and transferring the task-specific knowledge.

Self-Supervised Image Classification Semi-Supervised Image Classification

1 code implementation • NeurIPS 2020 • Yujun Yan, Kevin Swersky, Danai Koutra, Parthasarathy Ranganathan, Milad Hashemi

A significant effort has been made to train neural networks that replicate algorithmic reasoning, but they often fail to learn the abstract concepts underlying these algorithms.

1 code implementation • 18 Feb 2020 • Micha Livne, Kevin Swersky, David J. Fleet

MIM learning encourages high mutual information between observations and latent variables, and is robust against posterior collapse.

Ranked #1 on Question Answering on YahooCQA (using extra training data)

no code implementations • ICLR 2020 • Yujun Yan, Kevin Swersky, Danai Koutra, Parthasarathy Ranganathan, Milad Hashemi

Turing-complete computation and reasoning are often regarded as necessary precursors to general intelligence.

4 code implementations • ICLR 2020 • Will Grathwohl, Kuan-Chieh Wang, Jörn-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, Kevin Swersky

In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y).
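The reinterpretation behind this (the JEM view of a classifier) is that the logits f(x) serve double duty: softmax over them gives p(y|x), while their logsumexp gives an unnormalized log p(x). A small numpy sketch with toy logit values:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0])    # f(x): one logit per class (toy values)
p_y_given_x = softmax(logits)          # the standard classifier output p(y|x)
log_p_x = np.logaddexp.reduce(logits)  # unnormalized log p(x) = logsumexp(f(x))
```

Nothing about the classifier changes; the same forward pass is simply read as both a discriminative and an (unnormalized) generative model.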

1 code implementation • 8 Oct 2019 • Micha Livne, Kevin Swersky, David J. Fleet

Experiments show that MIM learns representations with high mutual information, consistent encoding and decoding distributions, effective latent clustering, and data log likelihood comparable to VAE, while avoiding posterior collapse.

no code implementations • 4 Oct 2019 • Micha Livne, Kevin Swersky, David J. Fleet

We introduce the Mutual Information Machine (MIM), a novel formulation of representation learning, using a joint distribution over the observations and latent state in an encoder/decoder framework.

no code implementations • ICLR 2020 • Zhan Shi, Kevin Swersky, Daniel Tarlow, Parthasarathy Ranganathan, Milad Hashemi

In this work, we propose a new approach to use GNNs to learn fused representations of general source code and its execution.

no code implementations • 6 Jun 2019 • Elliot Creager, David Madras, Jörn-Henrik Jacobsen, Marissa A. Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel

We consider the problem of learning representations that achieve group and subgroup fairness with respect to multiple sensitive attributes.

2 code implementations • 31 May 2019 • Aidan N. Gomez, Ivan Zhang, Siddhartha Rao Kamalakara, Divyam Madaan, Kevin Swersky, Yarin Gal, Geoffrey E. Hinton

Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights.
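A minimal numpy sketch of that selection step, under the simplifying assumption of unstructured (per-weight) targeting with a magnitude criterion; the fraction and drop probability below are illustrative:

```python
import numpy as np

def targeted_dropout(w, targ_frac=0.5, drop_prob=0.5, rng=None):
    """Drop weights only from the lowest-magnitude fraction of w."""
    rng = rng or np.random.default_rng(0)
    flat = np.abs(w).ravel()
    k = int(targ_frac * flat.size)
    if k == 0:
        return w.copy()
    thresh = np.sort(flat)[k - 1]             # magnitude cutoff for candidates
    candidates = np.abs(w) <= thresh          # only small weights are targeted
    mask = candidates & (rng.random(w.shape) < drop_prob)
    return np.where(mask, 0.0, w)
```

Because only the already-small weights are ever dropped, the network learns to concentrate its function in the surviving weights, which is what makes later pruning cheap.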

1 code implementation • NeurIPS 2019 • Jenny Liu, Aviral Kumar, Jimmy Ba, Jamie Kiros, Kevin Swersky

We introduce graph normalizing flows: a new, reversible graph neural network model for prediction and generation.
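The reversibility comes from coupling-style updates: each half of the node features is updated by a function of the other half, so the transform can be inverted exactly. A sketch on plain feature matrices, with `tanh` standing in for the message-passing step (an assumption for brevity, not the paper's architecture):

```python
import numpy as np

def _split(h):
    d = h.shape[-1] // 2
    return h[..., :d], h[..., d:]

def _f(x, W):
    # Stand-in for a GNN message function; any function of one half works.
    return np.tanh(x @ W)

def forward(h, W1, W2):
    h1, h2 = _split(h)
    h1 = h1 + _f(h2, W1)          # update first half using the second
    h2 = h2 + _f(h1, W2)          # then the second half using the (new) first
    return np.concatenate([h1, h2], axis=-1)

def inverse(out, W1, W2):
    h1, h2 = _split(out)
    h2 = h2 - _f(h1, W2)          # undo the updates in reverse order
    h1 = h1 - _f(h2, W1)
    return np.concatenate([h1, h2], axis=-1)
```

Exact invertibility is what lets the model compute likelihoods for generation and avoid storing activations during training.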

no code implementations • 4 Apr 2019 • Rui Zhao, David Bieber, Kevin Swersky, Daniel Tarlow

In this work, we instead treat source code as a dynamic object and tackle the problem of modeling the edits that software developers make to source code files.

10 code implementations • ICLR 2020 • Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Utku Evci, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle

Few-shot classification refers to learning a classifier for new classes given only a few examples.

Ranked #7 on Few-Shot Image Classification on Meta-Dataset

1 code implementation • NIPS Workshop CDNNRIA 2018 • Aidan N. Gomez, Ivan Zhang, Kevin Swersky, Yarin Gal, Geoffrey E. Hinton

Neural networks are extremely flexible models due to their large number of parameters, which is beneficial for learning, but also highly redundant.

no code implementations • ICML 2018 • Milad Hashemi, Kevin Swersky, Jamie A. Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis, Parthasarathy Ranganathan

In this paper, we demonstrate the potential of deep learning to address the von Neumann bottleneck of memory performance.

8 code implementations • ICLR 2018 • Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua B. Tenenbaum, Hugo Larochelle, Richard S. Zemel

To address this paradigm, we propose novel extensions of Prototypical Networks (Snell et al., 2017) that are augmented with the ability to use unlabeled examples when producing prototypes.
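One refinement step in that spirit, sketched as a soft k-means update: unlabeled embeddings are softly assigned to the current prototypes, and prototypes are recomputed as weighted means over support plus soft-assigned unlabeled points. The distance-based soft assignment below is an illustrative choice.

```python
import numpy as np

def refine_prototypes(protos, support, support_labels, unlabeled):
    """One soft-assignment refinement step for class prototypes."""
    d = ((unlabeled[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    logits = -d
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)   # soft class assignment per unlabeled point
    refined = []
    for c in range(protos.shape[0]):
        sup = support[support_labels == c]
        num = sup.sum(0) + (w[:, c:c + 1] * unlabeled).sum(0)
        den = len(sup) + w[:, c].sum()
        refined.append(num / den)
    return np.stack(refined)
```

Unlabeled points near a prototype pull it toward them, while ambiguous points split their influence across classes.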

no code implementations • 16 Jun 2017 • Chung-Cheng Chiu, Dieterich Lawson, Yuping Luo, George Tucker, Kevin Swersky, Ilya Sutskever, Navdeep Jaitly

This is because the models require that the entirety of the input sequence be available at the beginning of inference, an assumption that is not valid for instantaneous speech recognition.

no code implementations • 16 May 2017 • Dieterich Lawson, Chung-Cheng Chiu, George Tucker, Colin Raffel, Kevin Swersky, Navdeep Jaitly

There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition.

40 code implementations • NeurIPS 2017 • Jake Snell, Kevin Swersky, Richard S. Zemel

We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class.
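The mechanism is simple enough to sketch in full: embed the support set, average each class's embeddings into a prototype, and classify queries by nearest prototype in Euclidean distance (the embedding network is omitted here and raw features are used in its place):

```python
import numpy as np

def prototypes(support, labels, n_classes):
    """Class prototype = mean of that class's (embedded) support points."""
    return np.stack([support[labels == c].mean(0) for c in range(n_classes)])

def classify(queries, protos):
    """Assign each query to the nearest prototype (squared Euclidean)."""
    d = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return d.argmin(-1)
```

At few-shot test time, new classes need only a handful of support examples to define their prototypes; no weights are updated.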

2 code implementations • 3 Nov 2015 • Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel

We investigate the problem of learning representations that are invariant to certain nuisance or sensitive factors of variation in the data while retaining as much of the remaining information as possible.

Ranked #4 on Sentiment Analysis on Multi-Domain Sentiment Dataset

no code implementations • ICCV 2015 • Jimmy Ba, Kevin Swersky, Sanja Fidler, Ruslan Salakhutdinov

One of the main challenges in Zero-Shot Learning of visual categories is gathering semantic attributes to accompany images.

4 code implementations • 19 Feb 2015 • Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md. Mostofa Ali Patwary, Prabhat, Ryan P. Adams

Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations.
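The workhorse of such methods is an acquisition function over the surrogate's posterior; expected improvement is the classic choice. A stdlib-only sketch for minimization under a Gaussian posterior N(mu, sigma²) at a candidate point:

```python
import math

def expected_improvement(mu, sigma, best):
    """EI for minimization: expected amount by which N(mu, sigma^2) beats best."""
    if sigma == 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    return (best - mu) * Phi + sigma * phi
```

The next evaluation is taken at the candidate maximizing EI, trading off low predicted mean against high predictive uncertainty.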

Ranked #127 on Image Classification on CIFAR-100

3 code implementations • 10 Feb 2015 • Yujia Li, Kevin Swersky, Richard Zemel

We consider the problem of learning deep generative models from data.

no code implementations • 17 Dec 2014 • Yujia Li, Kevin Swersky, Richard Zemel

Different forms of representation learning can be derived from alternative definitions of unwanted bias, e.g., bias toward particular tasks, domains, or irrelevant underlying data dimensions.

no code implementations • 14 Sep 2014 • Kevin Swersky, David Duvenaud, Jasper Snoek, Frank Hutter, Michael A. Osborne

In practical Bayesian optimization, we must often search over structures with differing numbers of parameters.

no code implementations • 16 Jun 2014 • Kevin Swersky, Jasper Snoek, Ryan Prescott Adams

In this paper we develop a dynamic form of Bayesian optimization for machine learning models with the goal of rapidly finding good hyperparameter settings.

1 code implementation • 5 Feb 2014 • Jasper Snoek, Kevin Swersky, Richard S. Zemel, Ryan P. Adams

Bayesian optimization has proven to be a highly effective methodology for the global optimization of unknown, expensive and multimodal functions.

1 code implementation • NeurIPS 2013 • Kevin Swersky, Jasper Snoek, Ryan P. Adams

We demonstrate the utility of this new acquisition function by utilizing a small dataset in order to explore hyperparameter settings for a large dataset.

Ranked #88 on Image Classification on STL-10

no code implementations • NeurIPS 2012 • Kevin Swersky, Ilya Sutskever, Daniel Tarlow, Richard S. Zemel, Ruslan R. Salakhutdinov, Ryan P. Adams

The Restricted Boltzmann Machine (RBM) is a popular density model that is also good for extracting features.
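For a binary RBM the hidden units can be summed out in closed form, giving the free energy F(v) = -b·v - Σⱼ softplus(cⱼ + (Wᵀv)ⱼ), which is what both density evaluation and gradient-based training operate on. A small numpy sketch with toy parameters:

```python
import numpy as np

def free_energy(v, W, b, c):
    """Free energy of a binary RBM with visible biases b, hidden biases c."""
    pre = c + v @ W                        # hidden pre-activations given v
    # softplus(x) = log(1 + e^x), computed stably via logaddexp(0, x)
    return -v @ b - np.logaddexp(0.0, pre).sum()
```

Lower free energy corresponds to higher unnormalized probability, so contrastive training pushes F down on data and up on model samples.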

Papers With Code is a free resource with all data licensed under CC-BY-SA.