Search Results for author: J. Zico Kolter

Found 114 papers, 72 papers with code

Contextually Supervised Source Separation with Application to Energy Disaggregation

no code implementations • 18 Dec 2013 • Matt Wytock, J. Zico Kolter

We propose a new framework for single-channel source separation that lies between the fully supervised and unsupervised setting.

Probabilistic Segmentation via Total Variation Regularization

no code implementations • 16 Nov 2015 • Matt Wytock, J. Zico Kolter

We present a convex approach to probabilistic segmentation and modeling of time series data.

Density Estimation Segmentation +2

Input Convex Neural Networks

3 code implementations • ICML 2017 • Brandon Amos, Lei Xu, J. Zico Kolter

We show that many existing neural network architectures can be made input-convex with a minor modification, and develop specialized optimization algorithms tailored to this setting.

Imputation Inference Optimization +3

268

OptNet: Differentiable Optimization as a Layer in Neural Networks

6 code implementations • ICML 2017 • Brandon Amos, J. Zico Kolter

This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks.

Bilevel Optimization

822

Task-based End-to-end Model Learning in Stochastic Optimization

1 code implementation • NeurIPS 2017 • Priya L. Donti, Brandon Amos, J. Zico Kolter

With the increasing popularity of machine learning techniques, it has become common to see prediction algorithms operating within some larger process.

BIG-bench Machine Learning Scheduling +1

191

The Mixing method: low-rank coordinate descent for semidefinite programming with diagonal constraints

1 code implementation • 1 Jun 2017 • Po-Wei Wang, Wei-Cheng Chang, J. Zico Kolter

In this paper, we propose a low-rank coordinate descent approach to structured semidefinite programming with diagonal constraints.

Learning Word Embeddings

Gradient descent GAN optimization is locally stable

1 code implementation • NeurIPS 2017 • Vaishnavh Nagarajan, J. Zico Kolter

Despite the growing prominence of generative adversarial networks (GANs), optimization in GANs is still a poorly understood topic.

Provable defenses against adversarial examples via the convex outer adversarial polytope

8 code implementations • ICML 2018 • Eric Wong, J. Zico Kolter

We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data.

Adversarial Attack

374

Convolutional Sequence Modeling Revisited

no code implementations • ICLR 2018 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun

This paper revisits the problem of sequence modeling using convolutional architectures.

Ranked #84 on Language Modelling on WikiText-103

Language Modelling Time Series Analysis

Realtime query completion via deep language models

no code implementations • ICLR 2018 • Po-Wei Wang, J. Zico Kolter, Vijai Mohan, Inderjit S. Dhillon

Search engine users nowadays heavily depend on query completion and correction to shape their queries.

Language Modelling

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

32 code implementations • 4 Mar 2018 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun

Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory.

Ranked #4 on Music Modeling on Nottingham

Audio Synthesis Language Modelling +5

4,669

What game are we playing? End-to-end learning in normal and extensive form games

1 code implementation • 7 May 2018 • Chun Kai Ling, Fei Fang, J. Zico Kolter

Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents.

Scaling provable adversarial defenses

4 code implementations • NeurIPS 2018 • Eric Wong, Frank R. Schmidt, Jan Hendrik Metzen, J. Zico Kolter

Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks.

374

Trellis Networks for Sequence Modeling

1 code implementation • ICLR 2019 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun

On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices.

Ranked #4 on Language Modelling on Penn Treebank (Character Level)

Language Modelling Sequential Image Classification

474

A Continuous-Time View of Early Stopping for Least Squares

no code implementations • 23 Oct 2018 • Alnur Ali, J. Zico Kolter, Ryan J. Tibshirani

Our primary focus is to compare the risk of gradient flow to that of ridge regression.

regression

Differentiable MPC for End-to-end Planning and Control

2 code implementations • NeurIPS 2018 • Brandon Amos, Ivan Dario Jimenez Rodriguez, Jacob Sacks, Byron Boots, J. Zico Kolter

We present foundations for using Model Predictive Control (MPC) as a differentiable policy class for reinforcement learning in continuous state and action spaces.

Imitation Learning Model Predictive Control

193

End-to-End Differentiable Physics for Learning and Control

1 code implementation • NeurIPS 2018 • Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, J. Zico Kolter

We present a differentiable physics engine that can be integrated as a module in deep neural networks for end-to-end learning.

287

Low-rank semidefinite programming for the MAX2SAT problem

1 code implementation • 15 Dec 2018 • Po-Wei Wang, J. Zico Kolter

This paper proposes a new algorithm for solving MAX2SAT problems based on combining search methods with semidefinite programming approaches.

Generalization in Deep Networks: The Role of Distance from Initialization

no code implementations • 7 Jan 2019 • Vaishnavh Nagarajan, J. Zico Kolter

Why does training deep neural networks using stochastic gradient descent (SGD) result in a generalization error that does not worsen with the number of parameters in the network?

Certified Adversarial Robustness via Randomized Smoothing

10 code implementations • 8 Feb 2019 • Jeremy M Cohen, Elan Rosenfeld, J. Zico Kolter

We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm.

Ranked #3 on Robust classification on CIFAR-10

Adversarial Defense Adversarial Robustness +1

351
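
The abstract above describes building a new classifier out of a base classifier's behavior under Gaussian noise. As a rough illustration only, the prediction step can be sketched as a majority vote over noisy copies of the input. The names `smoothed_predict` and `toy_classifier` are hypothetical, the toy classifier stands in for a trained model, and the certification step that yields a robustness guarantee is deliberately left out.

```python
import random
from collections import Counter


def smoothed_predict(base_classifier, x, sigma, n_samples=1000, seed=0):
    """Return the class most often predicted on Gaussian-perturbed copies of x."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    votes = Counter()
    for _ in range(n_samples):
        # Add independent N(0, sigma^2) noise to every coordinate.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    return votes.most_common(1)[0][0]


def toy_classifier(x):
    # Hypothetical base classifier: class 1 if the first coordinate is positive.
    return 1 if x[0] > 0 else 0


print(smoothed_predict(toy_classifier, [3.0, 0.0], sigma=0.5))
```

The vote only approximates the smoothed classifier; a certified prediction would additionally need a statistical bound on the vote proportions, which this sketch does not compute.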

Uniform convergence may be unable to explain generalization in deep learning

1 code implementation • NeurIPS 2019 • Vaishnavh Nagarajan, J. Zico Kolter

Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic technique of uniform convergence.

Generalization Bounds

Wasserstein Adversarial Examples via Projected Sinkhorn Iterations

2 code implementations • 21 Feb 2019 • Eric Wong, Frank R. Schmidt, J. Zico Kolter

In this paper, we propose a new threat model for adversarial attacks based on the Wasserstein distance.

Adversarial Attack Adversarial Defense +4

Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

no code implementations • 11 Mar 2019 • Chun Kai Ling, Fei Fang, J. Zico Kolter

With the recent advances in solving large, zero-sum extensive form games, there is a growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions.

Adversarial camera stickers: A physical camera-based attack on deep learning systems

1 code implementation • 21 Mar 2019 • Juncheng Li, Frank R. Schmidt, J. Zico Kolter

In this work, we consider an alternative question: is it possible to fool deep classifiers, over all perceived objects of a certain type, by physically manipulating the camera itself?

Deterministic PAC-Bayesian generalization bounds for deep networks via generalizing noise-resilience

no code implementations • ICLR 2019 • Vaishnavh Nagarajan, J. Zico Kolter

The ability of overparameterized deep networks to generalize well has been linked to the fact that stochastic gradient descent (SGD) finds solutions that lie in flat, wide minima in the training loss -- minima where the output of the network is resilient to small random noise added to its parameters.

Generalization Bounds

Multimodal Transformer for Unaligned Multimodal Language Sequences

4 code implementations • ACL 2019 • Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, Ruslan Salakhutdinov

Human language is often multimodal, which comprehends a mixture of natural language, facial gestures, and acoustic behaviors.

Ranked #5 on Multimodal Sentiment Analysis on MOSI

Multimodal Sentiment Analysis Time Series +1

753

Perceptual Based Adversarial Audio Attacks

no code implementations • 14 Jun 2019 • Joseph Szurley, J. Zico Kolter

Recent work has shown the possibility of adversarial attacks on automatic speech recognition (ASR) systems.

Audio and Speech Processing Sound

The Limited Multi-Label Projection Layer

1 code implementation • 20 Jun 2019 • Brandon Amos, Vladlen Koltun, J. Zico Kolter

We propose the Limited Multi-Label (LML) projection layer as a new primitive operation for end-to-end learning systems.

General Classification Graph Generation +1

Deep Equilibrium Models

9 code implementations • NeurIPS 2019 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun

We present a new approach to modeling sequential data: the deep equilibrium model (DEQ).

Ranked #29 on Language Modelling on Penn Treebank (Word Level)

Language Modelling

703

Adversarial Robustness Against the Union of Multiple Perturbation Models

1 code implementation • 9 Sep 2019 • Pratyush Maini, Eric Wong, J. Zico Kolter

Owing to the susceptibility of deep learning systems to adversarial attacks, there has been a great deal of work in developing (both empirically and certifiably) robust classifiers.

Adversarial Robustness

Certified Robustness to Adversarial Label-Flipping Attacks via Randomized Smoothing

no code implementations • 25 Sep 2019 • Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter

This paper considers label-flipping attacks, a type of data poisoning attack where an adversary relabels a small number of examples in a training set in order to degrade the performance of the resulting classifier.

Binary Classification Data Poisoning

Black-box Adversarial Attacks with Bayesian Optimization

1 code implementation • 30 Sep 2019 • Satya Narayan Shukla, Anit Kumar Sahu, Devin Willmott, J. Zico Kolter

We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples using information limited to loss function evaluations of input-output pairs.

Bayesian Optimization

Adversarial Music: Real World Audio Adversary Against Wake-word Detection System

no code implementations • NeurIPS 2019 • Juncheng B. Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze

In this work, we target our attack on the wake-word detection system, jamming the model with some inconspicuous background music to deactivate the VAs while our audio adversary is present.

Real-World Adversarial Attack

Dynamic Modeling and Equilibria in Fair Decision Making

no code implementations • 15 Nov 2019 • Joshua Williams, J. Zico Kolter

Recent studies on fairness in automated decision making systems have both investigated the potential future impact of these decisions on the population at large, and emphasized that imposing "typical" fairness constraints such as demographic parity or equality of opportunity does not guarantee a benefit to disadvantaged groups.

Decision Making Fairness

AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning

4 code implementations • 2 Dec 2019 • Rizal Fathony, J. Zico Kolter

We propose a method that enables practitioners to conveniently incorporate custom non-decomposable performance metrics into differentiable learning pipelines, notably those based upon neural network architectures.

General Classification Image Classification

Fast is better than free: Revisiting adversarial training

10 code implementations • ICLR 2020 • Eric Wong, Leslie Rice, J. Zico Kolter

Furthermore we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with $\epsilon=8/255$ in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at $\epsilon=2/255$ in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds.

408

Learning Stable Deep Dynamics Models

1 code implementation • NeurIPS 2019 • Gaurav Manek, J. Zico Kolter

Deep networks are commonly used to model dynamical systems, predicting how the state of a system will evolve over time (either autonomously or in response to control inputs).

Certified Robustness to Label-Flipping Attacks via Randomized Smoothing

no code implementations • ICML 2020 • Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter

Machine learning algorithms are known to be susceptible to data poisoning attacks, where an adversary manipulates the training data to degrade performance of the resulting classifier.

Data Poisoning General Classification +1

Overfitting in adversarially robust deep learning

4 code implementations • ICML 2020 • Leslie Rice, Eric Wong, J. Zico Kolter

Based upon this observed effect, we show that the performance gains of virtually all recent algorithmic improvements upon adversarial training can be matched by simply using early stopping.

Data Augmentation

153

Denoised Smoothing: A Provable Defense for Pretrained Classifiers

4 code implementations • NeurIPS 2020 • Hadi Salman, Ming-Jie Sun, Greg Yang, Ashish Kapoor, J. Zico Kolter

We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks.

General Classification Image Classification +1

Differentiable learning of numerical rules in knowledge graphs

no code implementations • ICLR 2020 • Po-Wei Wang, Daria Stepanova, Csaba Domokos, J. Zico Kolter

Rules over a knowledge graph (KG) capture interpretable patterns in data and can be used for KG cleaning and completion.

Knowledge Graphs

Multiscale Deep Equilibrium Models

4 code implementations • NeurIPS 2020 • Shaojie Bai, Vladlen Koltun, J. Zico Kolter

These simultaneously-learned multi-resolution features allow us to train a single model on a diverse set of tasks and loss functions, such as using a single MDEQ to perform both image classification and semantic segmentation.

Ranked #46 on Semantic Segmentation on Cityscapes val

General Classification Image Classification +2

703

Monotone operator equilibrium networks

1 code implementation • NeurIPS 2020 • Ezra Winston, J. Zico Kolter

We then develop a parameterization of the network which ensures that all operators remain monotone, which guarantees the existence of a unique equilibrium point.

Neural Network Virtual Sensors for Fuel Injection Quantities with Provable Performance Specifications

no code implementations • 30 Jun 2020 • Eric Wong, Tim Schneider, Joerg Schmitt, Frank R. Schmidt, J. Zico Kolter

Additionally, we show how specific intervals of fuel injection quantities can be targeted to maximize robustness for certain ranges, allowing us to train a virtual sensor for fuel injection which is provably guaranteed to have at most 10.69% relative error under noise while maintaining 3% relative error on non-adversarial data within normalized fuel injection ranges of 0.6 to 1.0.

Provably Safe PAC-MDP Exploration Using Analogies

1 code implementation • 7 Jul 2020 • Melrose Roderick, Vaishnavh Nagarajan, J. Zico Kolter

A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure).

reinforcement-learning Reinforcement Learning (RL) +1

Combining Differentiable PDE Solvers and Graph Neural Networks for Fluid Flow Prediction

2 code implementations • ICML 2020 • Filipe de Avila Belbute-Peres, Thomas D. Economon, J. Zico Kolter

Solving large complex partial differential equations (PDEs), such as those that arise in computational fluid dynamics (CFD), is a computationally expensive process.

234

Simple and Efficient Hard Label Black-box Adversarial Attacks in Low Query Budget Regimes

1 code implementation • 13 Jul 2020 • Satya Narayan Shukla, Anit Kumar Sahu, Devin Willmott, J. Zico Kolter

We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples for deep learning models solely based on information limited to output label (hard label) to a queried data input.

Bayesian Optimization

Learning perturbation sets for robust machine learning

1 code implementation • ICLR 2021 • Eric Wong, J. Zico Kolter

In this paper, we aim to bridge this gap by learning perturbation sets from data, in order to characterize real-world effects for robust training and evaluation.

BIG-bench Machine Learning

Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks

1 code implementation • 8 Oct 2020 • Anit Kumar Sahu, Satya Narayan Shukla, J. Zico Kolter

We study the problem of generating adversarial examples in a black-box setting, where we only have access to a zeroth order oracle, providing us with loss function evaluations.

Poisoned classifiers are not only backdoored, they are fundamentally broken

1 code implementation • 18 Oct 2020 • MingJie Sun, Siddhant Agarwal, J. Zico Kolter

Under this threat model, we propose a test-time, human-in-the-loop attack method to generate multiple effective alternative triggers without access to the initial backdoor and the training data.

Enforcing robust control guarantees within neural network policies

1 code implementation • ICLR 2021 • Priya L. Donti, Melrose Roderick, Mahyar Fazlyab, J. Zico Kolter

When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance.

Challenging common interpretability assumptions in feature attribution explanations

1 code implementation • 4 Dec 2020 • Jonathan Dinu, Jeffrey Bigham, J. Zico Kolter

As machine learning and algorithmic decision making systems are increasingly being leveraged in high-stakes human-in-the-loop settings, there is a pressing need to understand the rationale of their predictions.

Decision Making Explainable Artificial Intelligence (XAI) +1

Community detection using fast low-cardinality semidefinite programming

1 code implementation • NeurIPS 2020 • Po-Wei Wang, J. Zico Kolter

Modularity maximization has been a fundamental tool for understanding the community structure of a network, but the underlying optimization problem is nonconvex and NP-hard to solve.

Community Detection

Efficient semidefinite-programming-based inference for binary and multi-class MRFs

1 code implementation • NeurIPS 2020 • Chirag Pabbaraju, Po-Wei Wang, J. Zico Kolter

Probabilistic inference in pairwise Markov Random Fields (MRFs), i.e., computing the partition function or computing a MAP estimate of the variables, is a foundational problem in probabilistic graphical models.

Deep Archimedean Copulas

1 code implementation • NeurIPS 2020 • Chun Kai Ling, Fei Fang, J. Zico Kolter

A central problem in machine learning and statistics is to model joint densities of random variables from data.

On Proximal Policy Optimization's Heavy-tailed Gradients

no code implementations • 20 Feb 2021 • Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

In this paper, we present a detailed empirical study to characterize the heavy-tailed nature of the gradients of the PPO surrogate reward function.

Continuous Control

Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability

1 code implementation • ICLR 2021 • Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar

We empirically demonstrate that full-batch gradient descent on neural network training objectives typically operates in a regime we call the Edge of Stability.

Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Complete and Incomplete Neural Network Robustness Verification

4 code implementations • NeurIPS 2021 • Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter

Compared to the typically tightest but very costly semidefinite programming (SDP) based incomplete verifiers, we obtain higher verified accuracy with three orders of magnitude less verification time.

Adversarial Attack

306

Orthogonalizing Convolutional Layers with the Cayley Transform

1 code implementation • ICLR 2021 • Asher Trockman, J. Zico Kolter

Recent work has highlighted several advantages of enforcing orthogonality in the weight layers of deep networks, such as maintaining the stability of activations, preserving gradient norms, and enhancing adversarial robustness by enforcing low Lipschitz constants.

Adversarial Robustness

DC3: A learning method for optimization with hard constraints

1 code implementation • ICLR 2021 • Priya L. Donti, David Rolnick, J. Zico Kolter

Large optimization problems with hard constraints arise in many settings, yet classical solvers are often prohibitively slow, motivating the use of deep networks as cheap "approximate solvers."

119

RATT: Leveraging Unlabeled Data to Guarantee Generalization

1 code implementation • 1 May 2021 • Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton

To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data.

Generalization Bounds Holdout Set +1

Enforcing Policy Feasibility Constraints through Differentiable Projection for Energy Optimization

1 code implementation • 19 May 2021 • Bingqing Chen, Priya Donti, Kyri Baker, J. Zico Kolter, Mario Berges

Specifically, we incorporate a differentiable projection layer within a neural network-based policy to enforce that all learned actions are feasible.

Reinforcement Learning (RL)

DORO: Distributional and Outlier Robust Optimization

1 code implementation • 11 Jun 2021 • Runtian Zhai, Chen Dan, J. Zico Kolter, Pradeep Ravikumar

Many machine learning tasks involve subpopulation shift where the testing data distribution is a subpopulation of the training distribution.

Open-Ended Question Answering

DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting

1 code implementation • 16 Jun 2021 • Shaoru Chen, Eric Wong, J. Zico Kolter, Mahyar Fazlyab

Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative.

Image Classification

Assessing Generalization of SGD via Disagreement

no code implementations • ICLR 2022 • Yiding Jiang, Vaishnavh Nagarajan, Christina Baek, J. Zico Kolter

We empirically show that the test error of deep networks can be estimated by simply training the same architecture on the same training set but with a different run of Stochastic Gradient Descent (SGD), and measuring the disagreement rate between the two networks on unlabeled test data.
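
The measurement described in this abstract, the disagreement rate between two independently trained networks on unlabeled data, reduces to a simple count once both sets of predictions are available. A minimal sketch follows, assuming the predictions have already been collected as equal-length lists of class labels; the function name `disagreement_rate` and the sample labels are illustrative, not taken from the paper.

```python
def disagreement_rate(preds_a, preds_b):
    """Fraction of inputs on which two models predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("prediction lists must be non-empty")
    # No ground-truth labels are needed: only the two models' outputs are compared.
    differing = sum(a != b for a, b in zip(preds_a, preds_b))
    return differing / len(preds_a)


# Hypothetical predictions from two training runs on four unlabeled test points.
run_1 = [0, 1, 2, 1]
run_2 = [0, 2, 2, 0]
print(disagreement_rate(run_1, run_2))  # 0.5
```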

Stabilizing Equilibrium Models by Jacobian Regularization

1 code implementation • 28 Jun 2021 • Shaojie Bai, Vladlen Koltun, J. Zico Kolter

Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer.

Language Modelling

703
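
To make "finding the fixed point of a single nonlinear layer" concrete, here is a toy sketch using plain repeated application of a scalar layer. Everything in it is an assumption for illustration: `find_fixed_point` and `toy_layer` are hypothetical names, the layer is a scalar `tanh` with a weight chosen so the iteration contracts, and naive iteration stands in for whatever solver a real equilibrium model would use.

```python
import math


def find_fixed_point(layer, x, z0=0.0, tol=1e-10, max_iter=1000):
    """Iterate z <- layer(z, x) until successive iterates differ by less than tol."""
    z = z0
    for _ in range(max_iter):
        z_next = layer(z, x)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


def toy_layer(z, x):
    # Hypothetical scalar layer; the weight 0.5 makes it a contraction in z,
    # so the iteration above converges to a unique fixed point.
    return math.tanh(0.5 * z + x)


z_star = find_fixed_point(toy_layer, x=1.0)
# At the fixed point, applying the layer again leaves z_star unchanged.
print(abs(toy_layer(z_star, 1.0) - z_star) < 1e-8)  # True
```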

Communicating via Markov Decision Processes

1 code implementation • 17 Jul 2021 • Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster

We contribute a theoretically grounded approach to MCGs based on maximum entropy reinforcement learning and minimum entropy coupling that we call MEME.

Multi-agent Reinforcement Learning

Adversarially Robust Learning for Security-Constrained Optimal Power Flow

no code implementations • NeurIPS 2021 • Priya L. Donti, Aayushya Agarwal, Neeraj Vijay Bedmutha, Larry Pileggi, J. Zico Kolter

In recent years, the ML community has seen surges of interest in both adversarially robust learning and implicit layers, but connections between these two areas have seldom been explored.

Joint inference and input optimization in equilibrium networks

1 code implementation • NeurIPS 2021 • Swaminathan Gurumurthy, Shaojie Bai, Zachary Manchester, J. Zico Kolter

Many tasks in deep learning involve optimizing over the inputs to a network to minimize or maximize some objective; examples include optimization over latent spaces in a generative model to match a target image, or adversarially perturbing an input to worsen classifier performance.

Denoising Meta-Learning

$(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations

no code implementations • NeurIPS 2021 • Zhichun Huang, Shaojie Bai, J. Zico Kolter

Recent research in deep learning has investigated two very different forms of "implicitness": implicit representations model high-frequency data such as images or 3D shapes directly via a low-dimensional neural network (often using, e.g., sinusoidal bases or nonlinearities); implicit layers, in contrast, refer to techniques where the forward pass of a network is computed via non-linear dynamical systems, such as fixed-point or differential equation solutions, with the backward pass computed via the implicit function theorem.

Monte Carlo Tree Search With Iteratively Refining State Abstractions

no code implementations • NeurIPS 2021 • Samuel Sokota, Caleb Ho, Zaheen Ahmad, J. Zico Kolter

In this work, we present a method, called abstraction refining, for extending MCTS to stochastic environments which, unlike progressive widening, leverages the geometry of the state space.

Robustness between the worst and average case

no code implementations • NeurIPS 2021 • Leslie Rice, Anna Bair, Huan Zhang, J. Zico Kolter

Several recent works in machine learning have focused on evaluating the test-time robustness of a classifier: how well the classifier performs not just on the target domain it was trained upon, but upon perturbed examples.

Adversarial Robustness

Patches Are All You Need?

11 code implementations • 24 Jan 2022 • Asher Trockman, J. Zico Kolter

Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.

Ranked #96 on Image Classification on CIFAR-10

Image Classification

47,783

Deep Equilibrium Optical Flow Estimation

1 code implementation • CVPR 2022 • Shaojie Bai, Zhengyang Geng, Yash Savani, J. Zico Kolter

Many recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms by encouraging iterative refinements toward a stable flow estimation.

Ranked #1 on Optical Flow Estimation on KITTI 2015 (train)

Optical Flow Estimation

175

Smooth-Reduce: Leveraging Patches for Improved Certified Robustness

no code implementations • 12 May 2022 • Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde

Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

3 code implementations • 12 Jun 2022 • Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer

This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm.

MuJoCo Games reinforcement-learning +1

3,995

(Certified!!) Adversarial Robustness for Free!

1 code implementation • 21 Jun 2022 • Nicholas Carlini, Florian Tramer, Krishnamurthy Dj Dvijotham, Leslie Rice, MingJie Sun, J. Zico Kolter

In this paper we show how to achieve state-of-the-art certified adversarial robustness to 2-norm bounded perturbations by relying exclusively on off-the-shelf pretrained models.

Adversarial Robustness Denoising

General Cutting Planes for Bound-Propagation-Based Neural Network Verification

2 code implementations • 11 Aug 2022 • Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter

Our generalized bound propagation method, GCP-CROWN, opens up the opportunity to apply general cutting plane methods for neural network verification while benefiting from the efficiency and GPU acceleration of bound propagation methods.

205

Understanding the Covariance Structure of Convolutional Filters

no code implementations • 7 Oct 2022 • Asher Trockman, Devin Willmott, J. Zico Kolter

In this work, we first observe that such learned filters have highly-structured covariance matrices, and moreover, we find that covariances calculated from small networks may be used to effectively initialize a variety of larger networks of different depths, widths, patch sizes, and kernel sizes, indicating a degree of model-independence to the covariance structure.

Perfectly Secure Steganography Using Minimum Entropy Coupling

1 code implementation • 24 Oct 2022 • Christian Schroeder de Witt, Samuel Sokota, J. Zico Kolter, Jakob Foerster, Martin Strohmeier

Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning.

Characterizing Datapoints via Second-Split Forgetting

1 code implementation • 26 Oct 2022 • Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter

Popular metrics derived from these dynamics include (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out.

Paper
Code
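The first two metrics named in the snippet above can be computed from a per-example record of predictions across epochs. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and a "flip" is assumed here to mean that the predicted label differs between two consecutive epochs.

```python
def first_correct_epoch(correct_by_epoch):
    """Index of the first epoch at which the example is classified correctly.

    Returns None if the example is never classified correctly.
    """
    for epoch, is_correct in enumerate(correct_by_epoch):
        if is_correct:
            return epoch
    return None


def count_prediction_flips(predictions_by_epoch):
    """Number of consecutive-epoch pairs whose predicted labels differ.

    Assumption: this is one reasonable reading of "prediction flips";
    the paper may define the count differently.
    """
    pairs = zip(predictions_by_epoch, predictions_by_epoch[1:])
    return sum(1 for previous, current in pairs if previous != current)


# Illustrative data: one example with true label 1, tracked over six epochs.
true_label = 1
predictions = [2, 0, 1, 1, 0, 1]
correct = [p == true_label for p in predictions]

first_learned = first_correct_epoch(correct)   # 2
n_flips = count_prediction_flips(predictions)  # 4
```

The third metric (whether a prediction flips when the example is held out) requires retraining without the example, so it is not covered by this sketch.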

Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth

no code implementations • 26 Nov 2022 • Filipe de Avila Belbute-Peres, J. Zico Kolter

Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.

Paper
Add Code

Losses over Labels: Weakly Supervised Learning via Direct Loss Construction

1 code implementation • 13 Dec 2022 • Dylan Sam, J. Zico Kolter

Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak supervision is a growing paradigm within machine learning.

Feature Selection Image Classification +1

Paper
Code

Function Approximation for Solving Stackelberg Equilibrium in Large Perfect Information Games

1 code implementation • 29 Dec 2022 • Chun Kai Ling, J. Zico Kolter, Fei Fang

Function approximation (FA) has been a critical component in solving large zero-sum games.

Paper
Code

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

no code implementations • 22 Jan 2023 • Samuel Sokota, Ryan D'Orazio, Chun Kai Ling, David J. Wu, J. Zico Kolter, Noam Brown

Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches.

Vocal Bursts Valence Prediction

Paper
Add Code

Single Image Backdoor Inversion via Robust Smoothed Classifiers

1 code implementation • CVPR 2023 • MingJie Sun, J. Zico Kolter

Inspired by recent advances in adversarial robustness, our method SmoothInv starts from a single clean image, and then performs projected gradient descent towards the target class on a robust smoothed version of the original backdoored classifier.

Adversarial Robustness Image Generation

Paper
Code

Model-tuning Via Prompts Makes NLP Models Adversarially Robust

1 code implementation • 13 Mar 2023 • Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi

Across 5 NLP datasets, 4 adversarial attacks, and 3 different models, MVP improves performance against adversarial substitutions by an average of 8% over standard methods and even outperforms adversarial training-based state-of-the-art defenses by 3.5%.

Adversarial Robustness Language Modelling +1

Paper
Code

Sinkhorn-Flow: Predicting Probability Mass Flow in Dynamical Systems Using Optimal Transport

no code implementations • 14 Mar 2023 • Mukul Bhutani, J. Zico Kolter

Predicting how distributions over discrete variables vary over time is a common task in time series forecasting.

Time Series Time Series Forecasting

Paper
Add Code

Learning with Explanation Constraints

no code implementations • NeurIPS 2023 • Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar

In this paper, we formalize this notion as learning from explanation constraints and provide a learning theoretic framework to analyze how such explanations can improve the learning of our models.

Paper
Add Code

The Update-Equivalence Framework for Decision-Time Planning

no code implementations • 25 Apr 2023 • Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown

Using this framework, we derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.

Paper
Add Code

Mimetic Initialization of Self-Attention Layers

no code implementations • 16 May 2023 • Asher Trockman, J. Zico Kolter

It is notoriously difficult to train Transformers on small datasets; typically, large pre-trained models are instead used as the starting point.

Paper
Add Code

On the Joint Interaction of Models, Data, and Features

no code implementations • 7 Jun 2023 • Yiding Jiang, Christina Baek, J. Zico Kolter

Thus, we believe this work provides valuable new insight into our understanding of feature learning.

Paper
Add Code

A Simple and Effective Pruning Approach for Large Language Models

3 code implementations • 20 Jun 2023 • MingJie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter

Motivated by the recent observation of emergent large magnitude features in LLMs, our approach prunes weights with the smallest magnitudes multiplied by the corresponding input activations, on a per-output basis.

Network Pruning

517

Paper
Code
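The pruning rule described in the snippet above (weight magnitude multiplied by the corresponding input activation, compared per output) can be illustrated on a tiny dense layer. This is a hedged sketch rather than the released code: the function name `prune_per_output` is hypothetical, and each input feature's activation is assumed to be summarized by its L2 norm over a small calibration batch.

```python
import math


def prune_per_output(weights, activations, sparsity):
    """Zero out the lowest-scoring weights in each output row.

    weights:     list of rows, one row per output unit, one column per input feature
    activations: list of calibration samples, each a list of input features
    sparsity:    fraction of weights to remove in every row

    Score of weight (i, j) = |weights[i][j]| * L2 norm of input feature j
    (the norm summary is an assumption of this sketch).
    """
    n_in = len(weights[0])
    norms = [math.sqrt(sum(sample[j] ** 2 for sample in activations))
             for j in range(n_in)]
    n_prune = int(n_in * sparsity)

    pruned = []
    for row in weights:
        scores = [abs(w) * norms[j] for j, w in enumerate(row)]
        # Rank within this row only, so every output keeps the same number of weights.
        lowest = sorted(range(n_in), key=lambda j: scores[j])[:n_prune]
        drop = set(lowest)
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# Illustrative values: input feature 0 has large activations, feature 1 tiny ones.
weights = [[0.1, -2.0, 0.5, 1.0],
           [3.0, 0.2, -0.4, 0.05]]
activations = [[10.0, 0.1, 1.0, 1.0],
               [10.0, 0.1, 1.0, 1.0]]

pruned = prune_per_output(weights, activations, sparsity=0.5)
# pruned == [[0.1, 0.0, 0.0, 1.0], [3.0, 0.0, -0.4, 0.0]]
```

In the first row, plain magnitude pruning would remove 0.1 and 0.5. The activation-aware score instead keeps 0.1, because its input feature is large, and removes -2.0, whose input feature is close to zero.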

Localized Text-to-Image Generation for Free via Cross Attention Control

no code implementations • 26 Jun 2023 • Yutong He, Ruslan Salakhutdinov, J. Zico Kolter

Despite the tremendous success in text-to-image generative models, localized text-to-image generation (that is, generating objects or features at specific locations in an image while maintaining a consistent overall generation) still requires either explicit training or substantial additional inference time.

Semantic Segmentation Text-to-Image Generation

Paper
Add Code

T-MARS: Improving Visual Representations by Circumventing Text Feature Learning

1 code implementation • 6 Jul 2023 • Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan

However, naively removing all such data could also be wasteful, as it throws away images that contain visual features (in addition to overlapping text).

Optical Character Recognition

Paper
Code

Text Descriptions are Compressive and Invariant Representations for Visual Learning

no code implementations • 10 Jul 2023 • Zhili Feng, Anna Bair, J. Zico Kolter

This method first automatically generates multiple visual descriptions of each class via a large language model (LLM), then uses a VLM to translate these descriptions to a set of visual feature embeddings of each image, and finally uses sparse logistic regression to select a relevant subset of these features to classify each image.

Descriptive Few-Shot Learning +5

Paper
Add Code

Monotone deep Boltzmann machines

no code implementations • 11 Jul 2023 • Zhili Feng, Ezra Winston, J. Zico Kolter

Deep Boltzmann machines (DBMs), one of the first "deep" learning methods ever studied, are multi-layered probabilistic models governed by a pairwise energy function that describes the likelihood of all variables/nodes in the network.

Paper
Add Code

Can Neural Network Memorization Be Localized?

1 code implementation • 18 Jul 2023 • Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks memorize "hard" examples in the final few layers of the model.

Memorization

Paper
Code

Universal and Transferable Adversarial Attacks on Aligned Language Models

11 code implementations • 27 Jul 2023 • Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson

Specifically, our approach finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response (rather than refusing to answer).

Adversarial Attack

2,840

Paper
Code

Representation Engineering: A Top-Down Approach to AI Transparency

1 code implementation • 2 Oct 2023 • Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.

Ranked #3 on Question Answering on TruthfulQA

Question Answering

537

Paper
Code

Understanding prompt engineering may not require rethinking generalization

no code implementations • 6 Oct 2023 • Victor Akinwande, Yiding Jiang, Dylan Sam, J. Zico Kolter

Zero-shot learning in prompted vision-language models, the practice of crafting prompts to build classifiers without an explicit training process, has achieved impressive performance in many settings.

Generalization Bounds Language Modelling +3

Paper
Add Code

On the Neural Tangent Kernel of Equilibrium Models

no code implementations • 21 Oct 2023 • Zhili Feng, J. Zico Kolter

This work studies the neural tangent kernel (NTK) of the deep equilibrium (DEQ) model, a practical "infinite-depth" architecture which directly computes the infinite-depth limit of a weight-tied network via root-finding.

Paper
Add Code

TorchDEQ: A Library for Deep Equilibrium Models

1 code implementation • 28 Oct 2023 • Zhengyang Geng, J. Zico Kolter

Deep Equilibrium (DEQ) Models, an emerging class of implicit models that maps inputs to fixed points of neural networks, are of growing interest in the deep learning community.

Paper
Code
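The idea of mapping an input to the fixed point of a layer can be shown with a toy example. The sketch below does not use TorchDEQ's API. It assumes a hypothetical layer f(z, x) = tanh(Wz + Ux + b) with small hand-picked weights so that naive iteration converges, whereas practical DEQ implementations typically rely on faster root solvers.

```python
import math


def layer(z, x, W, U, b):
    """One application of the weight-tied layer f(z, x) = tanh(W z + U x + b)."""
    out = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(W[i][j] * z[j] for j in range(len(z)))
        pre += sum(U[i][k] * x[k] for k in range(len(x)))
        out.append(math.tanh(pre))
    return out


def solve_fixed_point(x, W, U, b, tol=1e-10, max_iter=500):
    """Iterate z <- f(z, x) from zero until successive iterates differ by less than tol."""
    z = [0.0] * len(b)
    for _ in range(max_iter):
        z_next = layer(z, x, W, U, b)
        if max(abs(a - c) for a, c in zip(z_next, z)) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical weights. The absolute row sums of W are below 1 and tanh is
# 1-Lipschitz, so f is a contraction in z and the iteration converges.
W = [[0.2, -0.1],
     [0.1, 0.3]]
U = [[1.0],
     [-0.5]]
b = [0.0, 0.1]
x = [0.7]

z_star = solve_fixed_point(x, W, U, b)
# z_star satisfies z_star ≈ layer(z_star, x, W, U, b)
residual = max(abs(a - c) for a, c in zip(layer(z_star, x, W, U, b), z_star))
```

The output for input `x` is `z_star` itself, which is what it means for the model to map inputs to fixed points: the same layer is applied until its output stops changing.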

Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning

no code implementations • 25 Nov 2023 • Melrose Roderick, Gaurav Manek, Felix Berkenkamp, J. Zico Kolter

A key problem in off-policy Reinforcement Learning (RL) is the mismatch, or distribution shift, between the dataset and the distribution over states and actions visited by the learned policy.

Q-Learning Reinforcement Learning (RL)

Paper
Add Code

Manifold Preserving Guided Diffusion

no code implementations • 28 Nov 2023 • Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon

Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.

Conditional Image Generation

Paper
Add Code

Deep Equilibrium Based Neural Operators for Steady-State PDEs

no code implementations • NeurIPS 2023 • Tanya Marwah, Ashwini Pokle, J. Zico Kolter, Zachary C. Lipton, Jianfeng Lu, Andrej Risteski

Motivated by this observation, we propose FNO-DEQ, a deep equilibrium variant of the FNO architecture that directly solves for the solution of a steady-state PDE as the infinite-depth fixed point of an implicit operator layer using a black-box root solver and differentiates analytically through this fixed point resulting in $\mathcal{O}(1)$ training memory.

Paper
Add Code

One-Step Diffusion Distillation via Deep Equilibrium Models

1 code implementation • NeurIPS 2023 • Zhengyang Geng, Ashwini Pokle, J. Zico Kolter

We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a $5\times$ larger ViT in terms of FID scores while striking a critical balance of computational cost and image quality.

Paper
Code

TOFU: A Task of Fictitious Unlearning for LLMs

no code implementations • 11 Jan 2024 • Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter

Large language models trained on massive corpora of data from the web can memorize and reproduce sensitive or private data raising both legal and ethical concerns.

Paper
Add Code

An Axiomatic Approach to Model-Agnostic Concept Explanations

no code implementations • 12 Jan 2024 • Zhili Feng, Michal Moshkovitz, Dotan Di Castro, J. Zico Kolter

Concept explanation is a popular approach for examining how human-interpretable concepts impact the predictions of a model.

Model Selection

Paper
Add Code

Bayesian Neural Networks with Domain Knowledge Priors

no code implementations • 20 Feb 2024 • Dylan Sam, Rattana Pukdee, Daniel P. Jeong, Yewon Byun, J. Zico Kolter

Bayesian neural networks (BNNs) have recently gained popularity due to their ability to quantify model uncertainty.

Fairness Variational Inference

Paper
Add Code

Massive Activations in Large Language Models

1 code implementation • 27 Feb 2024 • MingJie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu

We observe an empirical phenomenon in Large Language Models (LLMs) -- very few activations exhibit significantly larger values than others (e.g., 100,000 times larger).

Paper
Code

AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs

1 code implementation • 6 Mar 2024 • Victor Akinwande, J. Zico Kolter

Existing causal discovery methods based on combinatorial optimization or search are slow, prohibiting their application on large-scale datasets.

Causal Discovery Causal Inference +1

Paper
Code

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation

no code implementations • 28 Mar 2024 • Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter

Prompt engineering is effective for controlling the output of text-to-image (T2I) generative models, but it is also laborious due to the need for manually crafted prompts.

In-Context Learning Language Modelling +3

Paper
Add Code

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

2 code implementations • 10 Apr 2024 • Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter

Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets.

140

Paper
Code
