Search Results for author: J. Zico Kolter

Found 132 papers, 82 papers with code

Diffusing Differentiable Representations

no code implementations 9 Dec 2024 Yash Savani, Marc Finzi, J. Zico Kolter

We introduce a novel, training-free method for sampling differentiable representations (diffreps) using pretrained diffusion models.

Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion

no code implementations 30 Nov 2024 Michail Dontas, Yutong He, Naoki Murata, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov

Blind inverse problems, where both the target data and forward operator are unknown, are crucial to many computer vision applications.

Image Restoration

Inference Optimal VLMs Need Only One Visual Token but Larger Models

1 code implementation 5 Nov 2024 Kevin Y. Li, Sachin Goyal, Joao D. Semedo, J. Zico Kolter

Our results reveal a surprising trend: for visual reasoning tasks, the inference-optimal behavior in VLMs, i.e., minimum downstream error at any given fixed inference compute, is achieved when using the largest LLM that fits within the inference budget while minimizing visual token count - often to a single token.

Token Reduction Visual Reasoning

One-Step Diffusion Distillation through Score Implicit Matching

1 code implementation 22 Oct 2024 Weijian Luo, Zemin Huang, Zhengyang Geng, J. Zico Kolter, Guo-Jun Qi

In this paper, we present Score Implicit Matching (SIM), a new approach to distilling pre-trained diffusion models into single-step generator models, while maintaining almost the same sample generation ability as the original model as well as being data-free, with no need for training samples for distillation.

Rethinking Distance Metrics for Counterfactual Explainability

no code implementations 18 Oct 2024 Joshua Nathaniel Williams, Anurag Katakkar, Hoda Heidari, J. Zico Kolter

Counterfactual explanations have been a popular method of post-hoc explainability for a variety of settings in Machine Learning.

counterfactual

Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws

1 code implementation 15 Oct 2024 Yiding Jiang, Allan Zhou, Zhili Feng, Sadhika Malladi, J. Zico Kolter

The composition of pretraining data is a key determinant of foundation models' performance, but there is no standard guideline for allocating a limited computational budget across different data sources.

Computational Efficiency

Mimetic Initialization Helps State Space Models Learn to Recall

no code implementations 14 Oct 2024 Asher Trockman, Hrayr Harutyunyan, J. Zico Kolter, Sanjiv Kumar, Srinadh Bhojanapalli

Recent work has shown that state space models such as Mamba are significantly worse than Transformers on recall-based tasks due to the fact that their state size is constant with respect to their input sequence length.

Mamba State Space Models

Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

1 code implementation 14 Oct 2024 Sachin Goyal, Christina Baek, J. Zico Kolter, Aditi Raghunathan

However, models struggle to reliably follow the input context, especially when it conflicts with their parametric knowledge from pretraining.

Finetuning CLIP to Reason about Pairwise Differences

1 code implementation 15 Sep 2024 Dylan Sam, Devin Willmott, Joao D. Semedo, J. Zico Kolter

A notable drawback of CLIP, however, is that the resulting embedding space seems to lack some of the structure of their purely text-based alternatives.

Attribute Contrastive Learning +2

Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models

no code implementations 19 Aug 2024 Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu

In this work, we present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs).

Language Modelling Mamba +1

Prompt Recovery for Image Generation Models: A Comparative Study of Discrete Optimizers

no code implementations 12 Aug 2024 Joshua Nathaniel Williams, Avi Schwarzschild, J. Zico Kolter

Recovering natural language prompts for image generation models solely based on the generated images is a difficult discrete optimization problem.

Image Generation

FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers

1 code implementation 9 Aug 2024 Joshua Nathaniel Williams, J. Zico Kolter

The widespread use of large language models has resulted in a multitude of tokenizers and embedding spaces, making knowledge transfer in prompt discovery tasks difficult.

Image Captioning Transfer Learning

Understanding Hallucinations in Diffusion Models through Mode Interpolation

1 code implementation 13 Jun 2024 Sumukh K Aithal, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

Specifically, we find that diffusion models smoothly "interpolate" between nearby data modes in the training set, to generate samples that are completely outside the support of the original training distribution; this phenomenon leads diffusion models to generate artifacts that never existed in real data (i.e., hallucinations).

Hallucination Image Generation

Rethinking LLM Memorization through the Lens of Adversarial Compression

no code implementations 23 Apr 2024 Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

The ACR overcomes the limitations of existing notions of memorization by (i) offering an adversarial view of measuring memorization, especially for monitoring unlearning and compliance; and (ii) allowing for the flexibility to measure memorization for arbitrary strings at a reasonably low compute.

Memorization

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

1 code implementation 10 Apr 2024 Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter

Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets.

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation

no code implementations 28 Mar 2024 Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Nathaniel Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter

Prompt engineering is effective for controlling the output of text-to-image (T2I) generative models, but it is also laborious due to the need for manually crafted prompts.

In-Context Learning Language Modelling +3

AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs

1 code implementation 6 Mar 2024 Victor Akinwande, J. Zico Kolter

Existing causal discovery methods based on combinatorial optimization or search are slow, prohibiting their application on large-scale datasets.

Causal Discovery Causal Inference +1

Massive Activations in Large Language Models

4 code implementations 27 Feb 2024 MingJie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu

We observe an empirical phenomenon in Large Language Models (LLMs) -- very few activations exhibit significantly larger values than others (e.g., 100,000 times larger).

Bayesian Neural Networks with Domain Knowledge Priors

no code implementations 20 Feb 2024 Dylan Sam, Rattana Pukdee, Daniel P. Jeong, Yewon Byun, J. Zico Kolter

Bayesian neural networks (BNNs) have recently gained popularity due to their ability to quantify model uncertainty.

Fairness Variational Inference

An Axiomatic Approach to Model-Agnostic Concept Explanations

no code implementations 12 Jan 2024 Zhili Feng, Michal Moshkovitz, Dotan Di Castro, J. Zico Kolter

Concept explanation is a popular approach for examining how human-interpretable concepts impact the predictions of a model.

Model Selection

TOFU: A Task of Fictitious Unlearning for LLMs

3 code implementations 11 Jan 2024 Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter

Large language models trained on massive corpora of data from the web can memorize and reproduce sensitive or private data, raising both legal and ethical concerns.

Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

no code implementations CVPR 2024 Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter

Our work bridges this important gap in the literature by developing scaling laws that characterize the differing utility of various data subsets and accounting for how this diminishes for a data point at its nth repetition.

One-Step Diffusion Distillation via Deep Equilibrium Models

1 code implementation NeurIPS 2023 Zhengyang Geng, Ashwini Pokle, J. Zico Kolter

We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a $5\times$ larger ViT in terms of FID scores while striking a critical balance of computational cost and image quality.

Deep Equilibrium Based Neural Operators for Steady-State PDEs

no code implementations NeurIPS 2023 Tanya Marwah, Ashwini Pokle, J. Zico Kolter, Zachary C. Lipton, Jianfeng Lu, Andrej Risteski

Motivated by this observation, we propose FNO-DEQ, a deep equilibrium variant of the FNO architecture that directly solves for the solution of a steady-state PDE as the infinite-depth fixed point of an implicit operator layer using a black-box root solver and differentiates analytically through this fixed point resulting in $\mathcal{O}(1)$ training memory.

Manifold Preserving Guided Diffusion

no code implementations 28 Nov 2023 Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon

Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.

Conditional Image Generation

Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning

no code implementations 25 Nov 2023 Melrose Roderick, Gaurav Manek, Felix Berkenkamp, J. Zico Kolter

A key problem in off-policy Reinforcement Learning (RL) is the mismatch, or distribution shift, between the dataset and the distribution over states and actions visited by the learned policy.

Q-Learning Reinforcement Learning (RL)

TorchDEQ: A Library for Deep Equilibrium Models

1 code implementation 28 Oct 2023 Zhengyang Geng, J. Zico Kolter

Deep Equilibrium (DEQ) Models, an emerging class of implicit models that maps inputs to fixed points of neural networks, are of growing interest in the deep learning community.

On the Neural Tangent Kernel of Equilibrium Models

no code implementations 21 Oct 2023 Zhili Feng, J. Zico Kolter

This work studies the neural tangent kernel (NTK) of the deep equilibrium (DEQ) model, a practical "infinite-depth" architecture which directly computes the infinite-depth limit of a weight-tied network via root-finding.

Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line

no code implementations 7 Oct 2023 Eungyeup Kim, MingJie Sun, Christina Baek, Aditi Raghunathan, J. Zico Kolter

To analyze this, we revisit the theoretical conditions from Miller et al. (2021) that outline the types of distribution shifts needed for perfect ACL in linear models.

Model Selection Test-time Adaptation

Understanding prompt engineering may not require rethinking generalization

no code implementations 6 Oct 2023 Victor Akinwande, Yiding Jiang, Dylan Sam, J. Zico Kolter

Zero-shot learning in prompted vision-language models, the practice of crafting prompts to build classifiers without an explicit training process, has achieved impressive performance in many settings.

Generalization Bounds Language Modelling +3

Representation Engineering: A Top-Down Approach to AI Transparency

4 code implementations 2 Oct 2023 Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.

Question Answering

Universal and Transferable Adversarial Attacks on Aligned Language Models

23 code implementations 27 Jul 2023 Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson

Specifically, our approach finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response (rather than refusing to answer).

Adversarial Attack Ingenuity

Can Neural Network Memorization Be Localized?

1 code implementation 18 Jul 2023 Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks memorize "hard" examples in the final few layers of the model.

Memorization

Monotone deep Boltzmann machines

no code implementations 11 Jul 2023 Zhili Feng, Ezra Winston, J. Zico Kolter

Deep Boltzmann machines (DBMs), one of the first "deep" learning methods ever studied, are multi-layered probabilistic models governed by a pairwise energy function that describes the likelihood of all variables/nodes in the network.

Text Descriptions are Compressive and Invariant Representations for Visual Learning

no code implementations 10 Jul 2023 Zhili Feng, Anna Bair, J. Zico Kolter

This method first automatically generates multiple visual descriptions of each class via a large language model (LLM), then uses a VLM to translate these descriptions to a set of visual feature embeddings of each image, and finally uses sparse logistic regression to select a relevant subset of these features to classify each image.

Descriptive Few-Shot Learning +5

T-MARS: Improving Visual Representations by Circumventing Text Feature Learning

1 code implementation 6 Jul 2023 Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan

However, naively removing all such data could also be wasteful, as it throws away images that contain visual features (in addition to overlapping text).

Optical Character Recognition

Localized Text-to-Image Generation for Free via Cross Attention Control

no code implementations 26 Jun 2023 Yutong He, Ruslan Salakhutdinov, J. Zico Kolter

Despite the tremendous success in text-to-image generative models, localized text-to-image generation (that is, generating objects or features at specific locations in an image while maintaining a consistent overall generation) still requires either explicit training or substantial additional inference time.

Semantic Segmentation Text-to-Image Generation

A Simple and Effective Pruning Approach for Large Language Models

6 code implementations 20 Jun 2023 MingJie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter

Motivated by the recent observation of emergent large magnitude features in LLMs, our approach prunes weights with the smallest magnitudes multiplied by the corresponding input activations, on a per-output basis.

Network Pruning
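
The pruning rule described in the snippet above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `prune_per_output`, the choice of the L2 norm of each input feature as the activation statistic, and the toy matrices are all made up for the example.

```python
import math


def prune_per_output(weights, activations, sparsity):
    """Zero the lowest-scoring weights in each output row.

    weights:     list of rows, shape (n_out, n_in)
    activations: list of calibration samples, shape (n_samples, n_in)
    sparsity:    fraction of weights to remove from every row
    """
    n_in = len(weights[0])
    # Assumed activation statistic: L2 norm of each input feature over the samples.
    act_norm = [math.sqrt(sum(x[j] ** 2 for x in activations)) for j in range(n_in)]
    n_prune = int(n_in * sparsity)
    pruned = []
    for row in weights:
        # Score = weight magnitude times the norm of the matching input feature.
        scores = [abs(w) * act_norm[j] for j, w in enumerate(row)]
        # Compare scores only within this output row ("per-output basis").
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# Hypothetical toy layer: 2 outputs, 4 inputs; input feature 0 has large activations.
W = [[0.1, -2.0, 0.5, 1.0],
     [3.0, 0.2, -0.4, 0.05]]
X = [[10.0, 0.1, 1.0, 1.0],
     [10.0, 0.1, 1.0, 1.0]]
W_pruned = prune_per_output(W, X, sparsity=0.5)
print(W_pruned)  # -> [[0.1, 0.0, 0.0, 1.0], [3.0, 0.0, -0.4, 0.0]]
```

In the first row the small weight 0.1 survives while the much larger weight -2.0 is removed, because the input feature feeding 0.1 has far larger activations; pruning by magnitude alone would have made the opposite choice.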

On the Joint Interaction of Models, Data, and Features

no code implementations 7 Jun 2023 Yiding Jiang, Christina Baek, J. Zico Kolter

Thus, we believe this work provides valuable new insight into our understanding of feature learning.

Mimetic Initialization of Self-Attention Layers

no code implementations 16 May 2023 Asher Trockman, J. Zico Kolter

It is notoriously difficult to train Transformers on small datasets; typically, large pre-trained models are instead used as the starting point.

The Update-Equivalence Framework for Decision-Time Planning

no code implementations 25 Apr 2023 Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown

Using this framework, we derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.

Learning with Explanation Constraints

no code implementations NeurIPS 2023 Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar

In this paper, we formalize this notion as learning from explanation constraints and provide a learning theoretic framework to analyze how such explanations can improve the learning of our models.

Sinkhorn-Flow: Predicting Probability Mass Flow in Dynamical Systems Using Optimal Transport

no code implementations 14 Mar 2023 Mukul Bhutani, J. Zico Kolter

Predicting how distributions over discrete variables vary over time is a common task in time series forecasting.

Time Series Time Series Forecasting

Model-tuning Via Prompts Makes NLP Models Adversarially Robust

1 code implementation 13 Mar 2023 Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi

Across 5 NLP datasets, 4 adversarial attacks, and 3 different models, MVP improves performance against adversarial substitutions by an average of 8% over standard methods and even outperforms adversarial training-based state-of-the-art defenses by 3.5%.

Adversarial Robustness Language Modelling +1

Single Image Backdoor Inversion via Robust Smoothed Classifiers

1 code implementation CVPR 2023 MingJie Sun, J. Zico Kolter

Inspired by recent advances in adversarial robustness, our method SmoothInv starts from a single clean image, and then performs projected gradient descent towards the target class on a robust smoothed version of the original backdoored classifier.

Adversarial Robustness Image Generation

Permutation Equivariant Neural Functionals

2 code implementations NeurIPS 2023 Allan Zhou, KaiEn Yang, Kaylee Burns, Adriano Cardace, Yiding Jiang, Samuel Sokota, J. Zico Kolter, Chelsea Finn

The key building blocks of this framework are NF-Layers (neural functional layers) that we constrain to be permutation equivariant through an appropriate parameter sharing scheme.

Inductive Bias

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

no code implementations 22 Jan 2023 Samuel Sokota, Ryan D'Orazio, Chun Kai Ling, David J. Wu, J. Zico Kolter, Noam Brown

Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches.

Vocal Bursts Valence Prediction

Function Approximation for Solving Stackelberg Equilibrium in Large Perfect Information Games

1 code implementation 29 Dec 2022 Chun Kai Ling, J. Zico Kolter, Fei Fang

Function approximation (FA) has been a critical component in solving large zero-sum games.

Losses over Labels: Weakly Supervised Learning via Direct Loss Construction

1 code implementation 13 Dec 2022 Dylan Sam, J. Zico Kolter

Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak supervision is a growing paradigm within machine learning.

feature selection Image Classification +1

Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth

no code implementations 26 Nov 2022 Filipe de Avila Belbute-Peres, J. Zico Kolter

Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.

Characterizing Datapoints via Second-Split Forgetting

1 code implementation 26 Oct 2022 Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter

Popular metrics derived from these dynamics include (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out.
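
The first two training-dynamics metrics listed in the snippet above, (i) and (ii), can be computed from a per-epoch prediction log. The sketch below is illustrative only: the function names and the toy log are hypothetical, and it does not implement the second-split forgetting measure that the paper's title refers to.

```python
def first_correct_epoch(preds_by_epoch, label):
    """Metric (i): index of the first epoch whose prediction equals the label.

    Returns None if the example is never classified correctly.
    """
    for epoch, pred in enumerate(preds_by_epoch):
        if pred == label:
            return epoch
    return None


def num_prediction_flips(preds_by_epoch):
    """Metric (ii): number of epoch-to-epoch changes in the predicted class."""
    return sum(a != b for a, b in zip(preds_by_epoch, preds_by_epoch[1:]))


# Hypothetical log: predictions for one training example over six epochs.
preds = [2, 2, 1, 1, 2, 1]
true_label = 1
learned_at = first_correct_epoch(preds, true_label)  # 2
flips = num_prediction_flips(preds)                  # 3
```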

Perfectly Secure Steganography Using Minimum Entropy Coupling

2 code implementations 24 Oct 2022 Christian Schroeder de Witt, Samuel Sokota, J. Zico Kolter, Jakob Foerster, Martin Strohmeier

Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning.

Understanding the Covariance Structure of Convolutional Filters

no code implementations 7 Oct 2022 Asher Trockman, Devin Willmott, J. Zico Kolter

In this work, we first observe that such learned filters have highly-structured covariance matrices, and moreover, we find that covariances calculated from small networks may be used to effectively initialize a variety of larger networks of different depths, widths, patch sizes, and kernel sizes, indicating a degree of model-independence to the covariance structure.

General Cutting Planes for Bound-Propagation-Based Neural Network Verification

3 code implementations 11 Aug 2022 Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter

Our generalized bound propagation method, GCP-CROWN, opens up the opportunity to apply general cutting plane methods for neural network verification while benefiting from the efficiency and GPU acceleration of bound propagation methods.

(Certified!!) Adversarial Robustness for Free!

2 code implementations 21 Jun 2022 Nicholas Carlini, Florian Tramer, Krishnamurthy Dj Dvijotham, Leslie Rice, MingJie Sun, J. Zico Kolter

In this paper we show how to achieve state-of-the-art certified adversarial robustness to 2-norm bounded perturbations by relying exclusively on off-the-shelf pretrained models.

Adversarial Robustness Denoising

Smooth-Reduce: Leveraging Patches for Improved Certified Robustness

no code implementations 12 May 2022 Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde

Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.

Deep Equilibrium Optical Flow Estimation

1 code implementation CVPR 2022 Shaojie Bai, Zhengyang Geng, Yash Savani, J. Zico Kolter

Many recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms by encouraging iterative refinements toward a stable flow estimation.

Optical Flow Estimation

Patches Are All You Need?

11 code implementations 24 Jan 2022 Asher Trockman, J. Zico Kolter

Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.

Image Classification

Monte Carlo Tree Search With Iteratively Refining State Abstractions

no code implementations NeurIPS 2021 Samuel Sokota, Caleb Ho, Zaheen Ahmad, J. Zico Kolter

In this work, we present a method, called abstraction refining, for extending MCTS to stochastic environments which, unlike progressive widening, leverages the geometry of the state space.

Robustness between the worst and average case

no code implementations NeurIPS 2021 Leslie Rice, Anna Bair, Huan Zhang, J. Zico Kolter

Several recent works in machine learning have focused on evaluating the test-time robustness of a classifier: how well the classifier performs not just on the target domain it was trained upon, but upon perturbed examples.

Adversarial Robustness

$(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations

no code implementations NeurIPS 2021 Zhichun Huang, Shaojie Bai, J. Zico Kolter

Recent research in deep learning has investigated two very different forms of "implicitness": implicit representations model high-frequency data such as images or 3D shapes directly via a low-dimensional neural network (often using, e.g., sinusoidal bases or nonlinearities); implicit layers, in contrast, refer to techniques where the forward pass of a network is computed via non-linear dynamical systems, such as fixed-point or differential equation solutions, with the backward pass computed via the implicit function theorem.

Joint inference and input optimization in equilibrium networks

1 code implementation NeurIPS 2021 Swaminathan Gurumurthy, Shaojie Bai, Zachary Manchester, J. Zico Kolter

Many tasks in deep learning involve optimizing over the inputs to a network to minimize or maximize some objective; examples include optimization over latent spaces in a generative model to match a target image, or adversarially perturbing an input to worsen classifier performance.

Denoising Meta-Learning

Adversarially Robust Learning for Security-Constrained Optimal Power Flow

no code implementations NeurIPS 2021 Priya L. Donti, Aayushya Agarwal, Neeraj Vijay Bedmutha, Larry Pileggi, J. Zico Kolter

In recent years, the ML community has seen surges of interest in both adversarially robust learning and implicit layers, but connections between these two areas have seldom been explored.

Communicating via Markov Decision Processes

1 code implementation 17 Jul 2021 Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster

We contribute a theoretically grounded approach to MCGs based on maximum entropy reinforcement learning and minimum entropy coupling that we call MEME.

Multi-agent Reinforcement Learning

Stabilizing Equilibrium Models by Jacobian Regularization

1 code implementation 28 Jun 2021 Shaojie Bai, Vladlen Koltun, J. Zico Kolter

Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer.

Language Modelling
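
As a rough illustration of "finding the fixed point of a single nonlinear layer", the sketch below repeatedly applies a small tanh layer until its output stops changing. It is a toy under stated assumptions: the layer form `tanh(W z + x)`, the names `layer` and `solve_fixed_point`, and the weights are hypothetical, and plain iteration stands in for the root-finding solvers that equilibrium models normally use. The weight matrix is chosen so that the layer is a contraction, which guarantees that the iteration converges.

```python
import math


def layer(z, x, W):
    """One application of the assumed nonlinear layer f(z, x) = tanh(W z + x)."""
    n = len(z)
    return [math.tanh(sum(W[i][j] * z[j] for j in range(n)) + x[i]) for i in range(n)]


def solve_fixed_point(x, W, tol=1e-10, max_iter=1000):
    """Iterate z <- f(z, x) from zero until successive iterates differ by less than tol."""
    z = [0.0] * len(x)
    for _ in range(max_iter):
        z_next = layer(z, x, W)
        if max(abs(a - b) for a, b in zip(z_next, z)) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical weights: every row has absolute sum below 1, so f is a contraction.
W = [[0.2, -0.1],
     [0.1, 0.3]]
x = [0.5, -1.0]
z_star = solve_fixed_point(x, W)
# Applying the layer once more should leave z_star essentially unchanged.
residual = max(abs(a - b) for a, b in zip(layer(z_star, x, W), z_star))
```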

Assessing Generalization of SGD via Disagreement

no code implementations ICLR 2022 Yiding Jiang, Vaishnavh Nagarajan, Christina Baek, J. Zico Kolter

We empirically show that the test error of deep networks can be estimated by simply training the same architecture on the same training set but with a different run of Stochastic Gradient Descent (SGD), and measuring the disagreement rate between the two networks on unlabeled test data.
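
The quantity measured in the snippet above, the disagreement rate between two independently trained networks on unlabeled test data, reduces to a simple count once both sets of predictions are available. A minimal sketch follows; the function name and the toy predictions are hypothetical, and the training of the two networks is not shown.

```python
def disagreement_rate(preds_a, preds_b):
    """Fraction of inputs on which two models predict different classes."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("need at least one prediction")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Hypothetical class predictions from two training runs on the same 8 unlabeled inputs.
preds_run1 = [0, 1, 1, 2, 0, 2, 1, 0]
preds_run2 = [0, 1, 2, 2, 0, 1, 1, 0]
rate = disagreement_rate(preds_run1, preds_run2)  # 2 of 8 differ -> 0.25
```

No labels appear anywhere in the computation, which is what makes the quantity usable on unlabeled test data.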

DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting

1 code implementation 16 Jun 2021 Shaoru Chen, Eric Wong, J. Zico Kolter, Mahyar Fazlyab

Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative.

Image Classification

DORO: Distributional and Outlier Robust Optimization

1 code implementation 11 Jun 2021 Runtian Zhai, Chen Dan, J. Zico Kolter, Pradeep Ravikumar

Many machine learning tasks involve subpopulation shift where the testing data distribution is a subpopulation of the training distribution.

Open-Ended Question Answering

Enforcing Policy Feasibility Constraints through Differentiable Projection for Energy Optimization

1 code implementation 19 May 2021 Bingqing Chen, Priya Donti, Kyri Baker, J. Zico Kolter, Mario Berges

Specifically, we incorporate a differentiable projection layer within a neural network-based policy to enforce that all learned actions are feasible.

Reinforcement Learning (RL)

RATT: Leveraging Unlabeled Data to Guarantee Generalization

1 code implementation 1 May 2021 Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton

To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data.

Generalization Bounds Holdout Set +1

DC3: A learning method for optimization with hard constraints

1 code implementation ICLR 2021 Priya L. Donti, David Rolnick, J. Zico Kolter

Large optimization problems with hard constraints arise in many settings, yet classical solvers are often prohibitively slow, motivating the use of deep networks as cheap "approximate solvers."

Orthogonalizing Convolutional Layers with the Cayley Transform

1 code implementation ICLR 2021 Asher Trockman, J. Zico Kolter

Recent work has highlighted several advantages of enforcing orthogonality in the weight layers of deep networks, such as maintaining the stability of activations, preserving gradient norms, and enhancing adversarial robustness by enforcing low Lipschitz constants.

Adversarial Robustness

Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Complete and Incomplete Neural Network Robustness Verification

5 code implementations NeurIPS 2021 Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter

Compared to the typically tightest but very costly semidefinite programming (SDP) based incomplete verifiers, we obtain higher verified accuracy with three orders of magnitude less verification time.

Adversarial Attack

Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability

1 code implementation ICLR 2021 Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar

We empirically demonstrate that full-batch gradient descent on neural network training objectives typically operates in a regime we call the Edge of Stability.

Deep Archimedean Copulas

1 code implementation NeurIPS 2020 Chun Kai Ling, Fei Fang, J. Zico Kolter

A central problem in machine learning and statistics is to model joint densities of random variables from data.

Efficient semidefinite-programming-based inference for binary and multi-class MRFs

1 code implementation NeurIPS 2020 Chirag Pabbaraju, Po-Wei Wang, J. Zico Kolter

Probabilistic inference in pairwise Markov Random Fields (MRFs), i.e., computing the partition function or computing a MAP estimate of the variables, is a foundational problem in probabilistic graphical models.

Community detection using fast low-cardinality semidefinite programming

1 code implementation NeurIPS 2020 Po-Wei Wang, J. Zico Kolter

Modularity maximization has been a fundamental tool for understanding the community structure of a network, but the underlying optimization problem is nonconvex and NP-hard to solve.

Community Detection

Challenging common interpretability assumptions in feature attribution explanations

1 code implementation 4 Dec 2020 Jonathan Dinu, Jeffrey Bigham, J. Zico Kolter

As machine learning and algorithmic decision making systems are increasingly being leveraged in high-stakes human-in-the-loop settings, there is a pressing need to understand the rationale of their predictions.

Decision Making Explainable Artificial Intelligence (XAI) +1

Enforcing robust control guarantees within neural network policies

1 code implementation ICLR 2021 Priya L. Donti, Melrose Roderick, Mahyar Fazlyab, J. Zico Kolter

When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance.

Poisoned classifiers are not only backdoored, they are fundamentally broken

1 code implementation 18 Oct 2020 MingJie Sun, Siddhant Agarwal, J. Zico Kolter

Under this threat model, we propose a test-time, human-in-the-loop attack method to generate multiple effective alternative triggers without access to the initial backdoor and the training data.

Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial Attacks

1 code implementation 8 Oct 2020 Anit Kumar Sahu, Satya Narayan Shukla, J. Zico Kolter

We study the problem of generating adversarial examples in a black-box setting, where we only have access to a zeroth order oracle, providing us with loss function evaluations.

Learning perturbation sets for robust machine learning

1 code implementation ICLR 2021 Eric Wong, J. Zico Kolter

In this paper, we aim to bridge this gap by learning perturbation sets from data, in order to characterize real-world effects for robust training and evaluation.

BIG-bench Machine Learning

Simple and Efficient Hard Label Black-box Adversarial Attacks in Low Query Budget Regimes

1 code implementation 13 Jul 2020 Satya Narayan Shukla, Anit Kumar Sahu, Devin Willmott, J. Zico Kolter

We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples for deep learning models solely based on information limited to output label (hard label) to a queried data input.

Bayesian Optimization

Combining Differentiable PDE Solvers and Graph Neural Networks for Fluid Flow Prediction

2 code implementations ICML 2020 Filipe de Avila Belbute-Peres, Thomas D. Economon, J. Zico Kolter

Solving large complex partial differential equations (PDEs), such as those that arise in computational fluid dynamics (CFD), is a computationally expensive process.

Graph Neural Network

Provably Safe PAC-MDP Exploration Using Analogies

1 code implementation 7 Jul 2020 Melrose Roderick, Vaishnavh Nagarajan, J. Zico Kolter

A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure).

reinforcement-learning Reinforcement Learning +2

Neural Network Virtual Sensors for Fuel Injection Quantities with Provable Performance Specifications

no code implementations 30 Jun 2020 Eric Wong, Tim Schneider, Joerg Schmitt, Frank R. Schmidt, J. Zico Kolter

Additionally, we show how specific intervals of fuel injection quantities can be targeted to maximize robustness for certain ranges, allowing us to train a virtual sensor for fuel injection which is provably guaranteed to have at most 10.69% relative error under noise while maintaining 3% relative error on non-adversarial data within normalized fuel injection ranges of 0.6 to 1.0.

Monotone operator equilibrium networks

1 code implementation NeurIPS 2020 Ezra Winston, J. Zico Kolter

We then develop a parameterization of the network which ensures that all operators remain monotone, which guarantees the existence of a unique equilibrium point.

Multiscale Deep Equilibrium Models

4 code implementations NeurIPS 2020 Shaojie Bai, Vladlen Koltun, J. Zico Kolter

These simultaneously-learned multi-resolution features allow us to train a single model on a diverse set of tasks and loss functions, such as using a single MDEQ to perform both image classification and semantic segmentation.

General Classification Image Classification +2

Differentiable learning of numerical rules in knowledge graphs

no code implementations ICLR 2020 Po-Wei Wang, Daria Stepanova, Csaba Domokos, J. Zico Kolter

Rules over a knowledge graph (KG) capture interpretable patterns in data and can be used for KG cleaning and completion.

Knowledge Graphs

Overfitting in adversarially robust deep learning

4 code implementations ICML 2020 Leslie Rice, Eric Wong, J. Zico Kolter

Based upon this observed effect, we show that the performance gains of virtually all recent algorithmic improvements upon adversarial training can be matched by simply using early stopping.

Data Augmentation Deep Learning

Certified Robustness to Label-Flipping Attacks via Randomized Smoothing

no code implementations ICML 2020 Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter

Machine learning algorithms are known to be susceptible to data poisoning attacks, where an adversary manipulates the training data to degrade performance of the resulting classifier.

Data Poisoning General Classification +1

Learning Stable Deep Dynamics Models

1 code implementation NeurIPS 2019 Gaurav Manek, J. Zico Kolter

Deep networks are commonly used to model dynamical systems, predicting how the state of a system will evolve over time (either autonomously or in response to control inputs).

Fast is better than free: Revisiting adversarial training

11 code implementations ICLR 2020 Eric Wong, Leslie Rice, J. Zico Kolter

Furthermore we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with $\epsilon=8/255$ in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at $\epsilon=2/255$ in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds.

AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning

4 code implementations 2 Dec 2019 Rizal Fathony, J. Zico Kolter

We propose a method that enables practitioners to conveniently incorporate custom non-decomposable performance metrics into differentiable learning pipelines, notably those based upon neural network architectures.

General Classification Image Classification

Dynamic Modeling and Equilibria in Fair Decision Making

no code implementations 15 Nov 2019 Joshua Williams, J. Zico Kolter

Recent studies on fairness in automated decision making systems have both investigated the potential future impact of these decisions on the population at large, and emphasized that imposing "typical" fairness constraints such as demographic parity or equality of opportunity does not guarantee a benefit to disadvantaged groups.

Decision Making Fairness

Adversarial Music: Real World Audio Adversary Against Wake-word Detection System

no code implementations NeurIPS 2019 Juncheng B. Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze

In this work, we target our attack on the wake-word detection system, jamming the model with some inconspicuous background music to deactivate the VAs while our audio adversary is present.

Real-World Adversarial Attack

Black-box Adversarial Attacks with Bayesian Optimization

1 code implementation 30 Sep 2019 Satya Narayan Shukla, Anit Kumar Sahu, Devin Willmott, J. Zico Kolter

We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples using information limited to loss function evaluations of input-output pairs.

Bayesian Optimization

Certified Robustness to Adversarial Label-Flipping Attacks via Randomized Smoothing

no code implementations 25 Sep 2019 Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter

This paper considers label-flipping attacks, a type of data poisoning attack where an adversary relabels a small number of examples in a training set in order to degrade the performance of the resulting classifier.

Binary Classification Data Poisoning

Adversarial Robustness Against the Union of Multiple Perturbation Models

1 code implementation 9 Sep 2019 Pratyush Maini, Eric Wong, J. Zico Kolter

Owing to the susceptibility of deep learning systems to adversarial attacks, there has been a great deal of work in developing (both empirically and certifiably) robust classifiers.

Adversarial Robustness

The Limited Multi-Label Projection Layer

1 code implementation 20 Jun 2019 Brandon Amos, Vladlen Koltun, J. Zico Kolter

We propose the Limited Multi-Label (LML) projection layer as a new primitive operation for end-to-end learning systems.

General Classification Graph Generation +1

Perceptual Based Adversarial Audio Attacks

no code implementations 14 Jun 2019 Joseph Szurley, J. Zico Kolter

Recent work has shown the possibility of adversarial attacks on automatic speech recognition (ASR) systems.

Audio and Speech Processing Sound

Deterministic PAC-Bayesian generalization bounds for deep networks via generalizing noise-resilience

no code implementations ICLR 2019 Vaishnavh Nagarajan, J. Zico Kolter

The ability of overparameterized deep networks to generalize well has been linked to the fact that stochastic gradient descent (SGD) finds solutions that lie in flat, wide minima in the training loss -- minima where the output of the network is resilient to small random noise added to its parameters.

Generalization Bounds

Adversarial camera stickers: A physical camera-based attack on deep learning systems

1 code implementation 21 Mar 2019 Juncheng Li, Frank R. Schmidt, J. Zico Kolter

In this work, we consider an alternative question: is it possible to fool deep classifiers, over all perceived objects of a certain type, by physically manipulating the camera itself?

Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

no code implementations 11 Mar 2019 Chun Kai Ling, Fei Fang, J. Zico Kolter

With the recent advances in solving large, zero-sum extensive form games, there is a growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions.

Wasserstein Adversarial Examples via Projected Sinkhorn Iterations

2 code implementations 21 Feb 2019 Eric Wong, Frank R. Schmidt, J. Zico Kolter

In this paper, we propose a new threat model for adversarial attacks based on the Wasserstein distance.

Adversarial Attack Adversarial Defense +4

Uniform convergence may be unable to explain generalization in deep learning

1 code implementation NeurIPS 2019 Vaishnavh Nagarajan, J. Zico Kolter

Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic technique of uniform convergence.

Deep Learning Generalization Bounds

Certified Adversarial Robustness via Randomized Smoothing

11 code implementations 8 Feb 2019 Jeremy M Cohen, Elan Rosenfeld, J. Zico Kolter

We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm.

Adversarial Defense Adversarial Robustness +1

Generalization in Deep Networks: The Role of Distance from Initialization

no code implementations 7 Jan 2019 Vaishnavh Nagarajan, J. Zico Kolter

Why does training deep neural networks using stochastic gradient descent (SGD) result in a generalization error that does not worsen with the number of parameters in the network?

Low-rank semidefinite programming for the MAX2SAT problem

1 code implementation 15 Dec 2018 Po-Wei Wang, J. Zico Kolter

This paper proposes a new algorithm for solving MAX2SAT problems based on combining search methods with semidefinite programming approaches.

End-to-End Differentiable Physics for Learning and Control

1 code implementation NeurIPS 2018 Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, J. Zico Kolter

We present a differentiable physics engine that can be integrated as a module in deep neural networks for end-to-end learning.

Differentiable MPC for End-to-end Planning and Control

2 code implementations NeurIPS 2018 Brandon Amos, Ivan Dario Jimenez Rodriguez, Jacob Sacks, Byron Boots, J. Zico Kolter

We present foundations for using Model Predictive Control (MPC) as a differentiable policy class for reinforcement learning in continuous state and action spaces.

Imitation Learning Model Predictive Control +1

A Continuous-Time View of Early Stopping for Least Squares

no code implementations 23 Oct 2018 Alnur Ali, J. Zico Kolter, Ryan J. Tibshirani

Our primary focus is to compare the risk of gradient flow to that of ridge regression.

regression

Trellis Networks for Sequence Modeling

1 code implementation ICLR 2019 Shaojie Bai, J. Zico Kolter, Vladlen Koltun

On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices.

Language Modelling Sequential Image Classification

Scaling provable adversarial defenses

4 code implementations NeurIPS 2018 Eric Wong, Frank R. Schmidt, Jan Hendrik Metzen, J. Zico Kolter

Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks.

What game are we playing? End-to-end learning in normal and extensive form games

1 code implementation 7 May 2018 Chun Kai Ling, Fei Fang, J. Zico Kolter

Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents.

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

33 code implementations 4 Mar 2018 Shaojie Bai, J. Zico Kolter, Vladlen Koltun

Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory.

Audio Synthesis Language Modelling +5

Realtime query completion via deep language models

no code implementations ICLR 2018 Po-Wei Wang, J. Zico Kolter, Vijai Mohan, Inderjit S. Dhillon

Search engine users nowadays heavily depend on query completion and correction to shape their queries.

Language Modelling

Provable defenses against adversarial examples via the convex outer adversarial polytope

8 code implementations ICML 2018 Eric Wong, J. Zico Kolter

We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data.

Adversarial Attack

Gradient descent GAN optimization is locally stable

1 code implementation NeurIPS 2017 Vaishnavh Nagarajan, J. Zico Kolter

Despite the growing prominence of generative adversarial networks (GANs), optimization in GANs is still a poorly understood topic.

The Mixing method: low-rank coordinate descent for semidefinite programming with diagonal constraints

1 code implementation 1 Jun 2017 Po-Wei Wang, Wei-Cheng Chang, J. Zico Kolter

In this paper, we propose a low-rank coordinate descent approach to structured semidefinite programming with diagonal constraints.

Learning Word Embeddings

Task-based End-to-end Model Learning in Stochastic Optimization

1 code implementation NeurIPS 2017 Priya L. Donti, Brandon Amos, J. Zico Kolter

With the increasing popularity of machine learning techniques, it has become common to see prediction algorithms operating within some larger process.

BIG-bench Machine Learning Scheduling +1

OptNet: Differentiable Optimization as a Layer in Neural Networks

7 code implementations ICML 2017 Brandon Amos, J. Zico Kolter

This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks.

Bilevel Optimization

Input Convex Neural Networks

3 code implementations ICML 2017 Brandon Amos, Lei Xu, J. Zico Kolter

We show that many existing neural network architectures can be made input-convex with a minor modification, and develop specialized optimization algorithms tailored to this setting.

Imputation Inference Optimization +4

Probabilistic Segmentation via Total Variation Regularization

no code implementations 16 Nov 2015 Matt Wytock, J. Zico Kolter

We present a convex approach to probabilistic segmentation and modeling of time series data.

Density Estimation Segmentation +2

Contextually Supervised Source Separation with Application to Energy Disaggregation

no code implementations 18 Dec 2013 Matt Wytock, J. Zico Kolter

We propose a new framework for single-channel source separation that lies between the fully supervised and unsupervised setting.
