no code implementations • 9 Dec 2024 • Yash Savani, Marc Finzi, J. Zico Kolter
We introduce a novel, training-free method for sampling differentiable representations (diffreps) using pretrained diffusion models.
no code implementations • 30 Nov 2024 • Michail Dontas, Yutong He, Naoki Murata, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov
Blind inverse problems, where both the target data and forward operator are unknown, are crucial to many computer vision applications.
1 code implementation • 5 Nov 2024 • Kevin Y. Li, Sachin Goyal, Joao D. Semedo, J. Zico Kolter
Our results reveal a surprising trend: for visual reasoning tasks, the inference-optimal behavior in VLMs, i.e., the minimum downstream error at any given fixed inference compute, is achieved by using the largest LLM that fits within the inference budget while minimizing the visual token count, often down to a single token.
1 code implementation • 22 Oct 2024 • Weijian Luo, Zemin Huang, Zhengyang Geng, J. Zico Kolter, Guo-Jun Qi
In this paper, we present Score Implicit Matching (SIM), a new approach to distilling pre-trained diffusion models into single-step generator models that maintains almost the same sample generation ability as the original model while being data-free, with no need for training samples during distillation.
no code implementations • 18 Oct 2024 • Joshua Nathaniel Williams, Anurag Katakkar, Hoda Heidari, J. Zico Kolter
Counterfactual explanations have been a popular method of post-hoc explainability for a variety of settings in Machine Learning.
1 code implementation • 15 Oct 2024 • Yiding Jiang, Allan Zhou, Zhili Feng, Sadhika Malladi, J. Zico Kolter
The composition of pretraining data is a key determinant of foundation models' performance, but there is no standard guideline for allocating a limited computational budget across different data sources.
no code implementations • 14 Oct 2024 • Asher Trockman, Hrayr Harutyunyan, J. Zico Kolter, Sanjiv Kumar, Srinadh Bhojanapalli
Recent work has shown that state space models such as Mamba are significantly worse than Transformers on recall-based tasks because their state size is constant with respect to the input sequence length.
1 code implementation • 14 Oct 2024 • Sachin Goyal, Christina Baek, J. Zico Kolter, Aditi Raghunathan
However, models struggle to reliably follow the input context, especially when it conflicts with their parametric knowledge from pretraining.
1 code implementation • 15 Sep 2024 • Dylan Sam, Devin Willmott, Joao D. Semedo, J. Zico Kolter
A notable drawback of CLIP, however, is that the resulting embedding space seems to lack some of the structure of purely text-based alternatives.
no code implementations • 19 Aug 2024 • Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu
In this work, we present a method that is able to distill a pretrained Transformer architecture into alternative architectures such as state space models (SSMs).
no code implementations • 12 Aug 2024 • Joshua Nathaniel Williams, Avi Schwarzschild, J. Zico Kolter
Recovering the natural language prompts for image generation models, based solely on the generated images, is a difficult discrete optimization problem.
1 code implementation • 9 Aug 2024 • Joshua Nathaniel Williams, J. Zico Kolter
The widespread use of large language models has resulted in a multitude of tokenizers and embedding spaces, making knowledge transfer in prompt discovery tasks difficult.
1 code implementation • 20 Jun 2024 • Zhengyang Geng, Ashwini Pokle, William Luo, Justin Lin, J. Zico Kolter
For example, as of 2024, training a state-of-the-art consistency model (CM) on CIFAR-10 takes one week on 8 GPUs.
Ranked #11 on Image Generation on ImageNet 64x64
1 code implementation • 13 Jun 2024 • Sumukh K Aithal, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter
Specifically, we find that diffusion models smoothly "interpolate" between nearby data modes in the training set, to generate samples that are completely outside the support of the original training distribution; this phenomenon leads diffusion models to generate artifacts that never existed in real data (i.e., hallucinations).
no code implementations • 23 Apr 2024 • Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter
The ACR overcomes the limitations of existing notions of memorization by (i) offering an adversarial view of measuring memorization, especially for monitoring unlearning and compliance; and (ii) allowing the flexibility to measure memorization for arbitrary strings at reasonably low compute cost.
1 code implementation • 10 Apr 2024 • Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter
Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets.
no code implementations • 28 Mar 2024 • Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Nathaniel Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter
Prompt engineering is effective for controlling the output of text-to-image (T2I) generative models, but it is also laborious due to the need for manually crafted prompts.
1 code implementation • 6 Mar 2024 • Victor Akinwande, J. Zico Kolter
Existing causal discovery methods based on combinatorial optimization or search are slow, prohibiting their application on large-scale datasets.
4 code implementations • 27 Feb 2024 • MingJie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu
We observe an empirical phenomenon in Large Language Models (LLMs) -- very few activations exhibit significantly larger values than others (e.g., 100,000 times larger).
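As a hedged illustration of this finding (not the paper's released code; the `ratio` threshold and function name are assumptions for the sketch), such outlier activations can be flagged by comparing magnitudes against the median of the same hidden state:

```python
import torch

def find_massive_activations(hidden: torch.Tensor, ratio: float = 1000.0):
    """Return indices of activations whose magnitude dwarfs the median
    magnitude of the hidden-state tensor (the paper reports gaps on the
    order of 100,000x for a handful of activations)."""
    mags = hidden.abs()
    threshold = ratio * mags.median()
    return (mags > threshold).nonzero()
```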
no code implementations • 20 Feb 2024 • Dylan Sam, Rattana Pukdee, Daniel P. Jeong, Yewon Byun, J. Zico Kolter
Bayesian neural networks (BNNs) have recently gained popularity due to their ability to quantify model uncertainty.
no code implementations • 12 Jan 2024 • Zhili Feng, Michal Moshkovitz, Dotan Di Castro, J. Zico Kolter
Concept explanation is a popular approach for examining how human-interpretable concepts impact the predictions of a model.
3 code implementations • 11 Jan 2024 • Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter
Large language models trained on massive corpora of data from the web can memorize and reproduce sensitive or private data, raising both legal and ethical concerns.
no code implementations • CVPR 2024 • Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter
Our work bridges this important gap in the literature by developing scaling laws that characterize the differing utility of various data subsets and account for how this utility diminishes for a data point at its nth repetition.
1 code implementation • NeurIPS 2023 • Zhengyang Geng, Ashwini Pokle, J. Zico Kolter
We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a $5\times$ larger ViT in terms of FID scores while striking a critical balance of computational cost and image quality.
no code implementations • NeurIPS 2023 • Tanya Marwah, Ashwini Pokle, J. Zico Kolter, Zachary C. Lipton, Jianfeng Lu, Andrej Risteski
Motivated by this observation, we propose FNO-DEQ, a deep equilibrium variant of the FNO architecture that directly solves for the solution of a steady-state PDE as the infinite-depth fixed point of an implicit operator layer, using a black-box root solver, and differentiates analytically through this fixed point, resulting in $\mathcal{O}(1)$ training memory.
no code implementations • 28 Nov 2023 • Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon
Despite the recent advancements, conditional image generation still faces challenges of cost, generalizability, and the need for task-specific training.
no code implementations • 25 Nov 2023 • Melrose Roderick, Gaurav Manek, Felix Berkenkamp, J. Zico Kolter
A key problem in off-policy Reinforcement Learning (RL) is the mismatch, or distribution shift, between the dataset and the distribution over states and actions visited by the learned policy.
1 code implementation • 28 Oct 2023 • Zhengyang Geng, J. Zico Kolter
Deep Equilibrium (DEQ) Models, an emerging class of implicit models that maps inputs to fixed points of neural networks, are of growing interest in the deep learning community.
no code implementations • 21 Oct 2023 • Zhili Feng, J. Zico Kolter
This work studies the neural tangent kernel (NTK) of the deep equilibrium (DEQ) model, a practical "infinite-depth" architecture which directly computes the infinite-depth limit of a weight-tied network via root-finding.
no code implementations • 7 Oct 2023 • Eungyeup Kim, MingJie Sun, Christina Baek, Aditi Raghunathan, J. Zico Kolter
To analyze this, we revisit the theoretical conditions from Miller et al. (2021) that outline the types of distribution shifts needed for perfect ACL in linear models.
no code implementations • 6 Oct 2023 • Victor Akinwande, Yiding Jiang, Dylan Sam, J. Zico Kolter
Zero-shot learning in prompted vision-language models, the practice of crafting prompts to build classifiers without an explicit training process, has achieved impressive performance in many settings.
4 code implementations • 2 Oct 2023 • Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks
In this paper, we identify and characterize the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.
Ranked #3 on Question Answering on TruthfulQA
23 code implementations • 27 Jul 2023 • Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson
Specifically, our approach finds a suffix that, when attached to a wide range of queries asking an LLM to produce objectionable content, maximizes the probability that the model produces an affirmative response (rather than refusing to answer).
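A minimal, hedged sketch of one step of this greedy coordinate gradient search, assuming a Hugging Face-style causal LM that accepts `inputs_embeds` (function and variable names here are illustrative, not the paper's released code):

```python
import torch
import torch.nn.functional as F

def suffix_candidates(model, embed_matrix, prompt_ids, suffix_ids, target_ids, k=256):
    """One gradient step of a GCG-style search: take gradients of the
    target-span loss w.r.t. a one-hot encoding of the suffix tokens,
    then propose the top-k token substitutions at each suffix position."""
    vocab_size = embed_matrix.shape[0]
    one_hot = F.one_hot(suffix_ids, vocab_size).to(embed_matrix.dtype)
    one_hot.requires_grad_(True)
    embeds = torch.cat([embed_matrix[prompt_ids],
                        one_hot @ embed_matrix,        # differentiable suffix
                        embed_matrix[target_ids]], dim=0).unsqueeze(0)
    logits = model(inputs_embeds=embeds).logits
    # loss over the target span: predict each target token from its prefix
    start = len(prompt_ids) + len(suffix_ids)
    loss = F.cross_entropy(logits[0, start - 1:-1], target_ids)
    loss.backward()
    # the most promising substitutions have the most negative gradient
    return (-one_hot.grad).topk(k, dim=1).indices
```

In the full method, these candidates would be evaluated exactly and the single best swap kept, repeating until the target response becomes likely.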
1 code implementation • 18 Jul 2023 • Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang
Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks $\textit{memorize}$ "hard" examples in the final few layers of the model.
no code implementations • 11 Jul 2023 • Zhili Feng, Ezra Winston, J. Zico Kolter
Deep Boltzmann machines (DBMs), one of the first "deep" learning methods ever studied, are multi-layered probabilistic models governed by a pairwise energy function that describes the likelihood of all variables/nodes in the network.
no code implementations • 10 Jul 2023 • Zhili Feng, Anna Bair, J. Zico Kolter
This method first automatically generates multiple visual descriptions of each class via a large language model (LLM), then uses a VLM to translate these descriptions to a set of visual feature embeddings of each image, and finally uses sparse logistic regression to select a relevant subset of these features to classify each image.
1 code implementation • 6 Jul 2023 • Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan
However, naively removing all such data could also be wasteful, as it throws away images that contain visual features (in addition to overlapping text).
no code implementations • 26 Jun 2023 • Yutong He, Ruslan Salakhutdinov, J. Zico Kolter
Despite the tremendous success in text-to-image generative models, localized text-to-image generation (that is, generating objects or features at specific locations in an image while maintaining a consistent overall generation) still requires either explicit training or substantial additional inference time.
6 code implementations • 20 Jun 2023 • MingJie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
Motivated by the recent observation of emergent large magnitude features in LLMs, our approach prunes weights with the smallest magnitudes multiplied by the corresponding input activations, on a per-output basis.
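The pruning score is simply |W| scaled by the norm of the corresponding input feature, with weights compared within each output row. A minimal sketch of that metric, assuming a small calibration batch of input activations (not the released implementation):

```python
import torch

def wanda_mask(weight: torch.Tensor, acts: torch.Tensor, sparsity: float = 0.5):
    """weight: (out_features, in_features); acts: (n_tokens, in_features)
    calibration activations. Scores weights by |W| * ||x||_2 and prunes
    the lowest-scoring fraction within each output row."""
    score = weight.abs() * acts.norm(p=2, dim=0)    # broadcast over rows
    n_prune = int(weight.shape[1] * sparsity)
    prune_idx = score.argsort(dim=1)[:, :n_prune]   # per-output comparison group
    mask = torch.ones_like(weight, dtype=torch.bool)
    mask.scatter_(1, prune_idx, False)
    return mask  # apply as weight * mask
```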
no code implementations • 7 Jun 2023 • Yiding Jiang, Christina Baek, J. Zico Kolter
Thus, we believe this work provides valuable new insight into our understanding of feature learning.
no code implementations • 16 May 2023 • Asher Trockman, J. Zico Kolter
It is notoriously difficult to train Transformers on small datasets; typically, large pre-trained models are instead used as the starting point.
no code implementations • 25 Apr 2023 • Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown
Using this framework, we derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
no code implementations • NeurIPS 2023 • Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar
In this paper, we formalize this notion as learning from explanation constraints and provide a learning theoretic framework to analyze how such explanations can improve the learning of our models.
no code implementations • 14 Mar 2023 • Mukul Bhutani, J. Zico Kolter
Predicting how distributions over discrete variables vary over time is a common task in time series forecasting.
1 code implementation • 13 Mar 2023 • Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi
Across 5 NLP datasets, 4 adversarial attacks, and 3 different models, MVP improves performance against adversarial substitutions by an average of 8% over standard methods, and even outperforms adversarial-training-based state-of-the-art defenses by 3.5%.
1 code implementation • CVPR 2023 • MingJie Sun, J. Zico Kolter
Inspired by recent advances in adversarial robustness, our method SmoothInv starts from a single clean image, and then performs projected gradient descent towards the target class on a robust smoothed version of the original backdoored classifier.
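A schematic of one such projected gradient step toward the target class, with the smoothed classifier approximated by Monte Carlo noise samples (parameter values and names are illustrative assumptions, not the paper's code):

```python
import torch
import torch.nn.functional as F

def targeted_pgd_step_smoothed(model, x, x0, target, sigma=0.25, n=32,
                               alpha=1/255, eps=8/255):
    """One PGD step toward class `target` on a Monte Carlo approximation
    of the Gaussian-smoothed classifier, projected back into an L-inf
    ball around the clean image x0 (shapes: x, x0 are (1, C, H, W))."""
    x = x.clone().detach().requires_grad_(True)
    noise = sigma * torch.randn(n, *x.shape[1:], device=x.device)
    logits = model(x + noise)  # n noisy views of the same input
    loss = F.cross_entropy(logits, torch.full((n,), target, device=x.device))
    loss.backward()
    x_adv = x.detach() - alpha * x.grad.sign()   # step toward the target class
    x_adv = x0 + (x_adv - x0).clamp(-eps, eps)   # stay in the eps-ball around x0
    return x_adv.clamp(0, 1)
```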
2 code implementations • NeurIPS 2023 • Allan Zhou, KaiEn Yang, Kaylee Burns, Adriano Cardace, Yiding Jiang, Samuel Sokota, J. Zico Kolter, Chelsea Finn
The key building blocks of this framework are NF-Layers (neural functional layers) that we constrain to be permutation equivariant through an appropriate parameter sharing scheme.
no code implementations • 22 Jan 2023 • Samuel Sokota, Ryan D'Orazio, Chun Kai Ling, David J. Wu, J. Zico Kolter, Noam Brown
Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective on solving two-player zero-sum games and yields a simplified framework for decision-time planning in such games, free of the unappealing properties that plague existing decision-time planning approaches.
1 code implementation • 29 Dec 2022 • Chun Kai Ling, J. Zico Kolter, Fei Fang
Function approximation (FA) has been a critical component in solving large zero-sum games.
1 code implementation • 13 Dec 2022 • Dylan Sam, J. Zico Kolter
Owing to the prohibitive costs of generating large amounts of labeled data, programmatic weak supervision is a growing paradigm within machine learning.
no code implementations • 26 Nov 2022 • Filipe de Avila Belbute-Peres, J. Zico Kolter
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
1 code implementation • 26 Oct 2022 • Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter
Popular metrics derived from these dynamics include (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out.
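Metrics (i) and (ii) are straightforward to compute from a per-epoch prediction log; a minimal sketch (the array layout is an assumption for illustration):

```python
import numpy as np

def learning_dynamics(pred_history: np.ndarray, labels: np.ndarray):
    """pred_history: (n_epochs, n_examples) predicted labels per epoch.
    Returns (i) the first epoch each example is classified correctly
    (-1 if never) and (ii) how many times its prediction flips."""
    correct = pred_history == labels[None, :]
    first_correct = np.where(correct.any(axis=0), correct.argmax(axis=0), -1)
    n_flips = (pred_history[1:] != pred_history[:-1]).sum(axis=0)
    return first_correct, n_flips
```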
2 code implementations • 24 Oct 2022 • Christian Schroeder de Witt, Samuel Sokota, J. Zico Kolter, Jakob Foerster, Martin Strohmeier
Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning.
no code implementations • 7 Oct 2022 • Asher Trockman, Devin Willmott, J. Zico Kolter
In this work, we first observe that such learned filters have highly structured covariance matrices, and moreover, we find that covariances calculated from small networks may be used to effectively initialize a variety of larger networks of different depths, widths, patch sizes, and kernel sizes, indicating a degree of model-independence to the covariance structure.
3 code implementations • 11 Aug 2022 • Huan Zhang, Shiqi Wang, Kaidi Xu, Linyi Li, Bo Li, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter
Our generalized bound propagation method, GCP-CROWN, opens up the opportunity to apply general cutting plane methods for neural network verification while benefiting from the efficiency and GPU acceleration of bound propagation methods.
2 code implementations • 21 Jun 2022 • Nicholas Carlini, Florian Tramer, Krishnamurthy Dj Dvijotham, Leslie Rice, MingJie Sun, J. Zico Kolter
In this paper we show how to achieve state-of-the-art certified adversarial robustness to 2-norm bounded perturbations by relying exclusively on off-the-shelf pretrained models.
3 code implementations • 12 Jun 2022 • Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm.
no code implementations • 12 May 2022 • Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde
Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers.
1 code implementation • CVPR 2022 • Shaojie Bai, Zhengyang Geng, Yash Savani, J. Zico Kolter
Many recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms by encouraging iterative refinements toward a stable flow estimation.
Ranked #1 on Optical Flow Estimation on KITTI 2015 (train)
11 code implementations • 24 Jan 2022 • Asher Trockman, J. Zico Kolter
Despite its simplicity, we show that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.
Ranked #100 on Image Classification on CIFAR-10
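The architecture is compact enough to sketch in full; the following mirrors the block structure described in the paper (hyperparameter defaults here are illustrative):

```python
import torch.nn as nn

class Residual(nn.Module):
    def __init__(self, fn):
        super().__init__()
        self.fn = fn
    def forward(self, x):
        return self.fn(x) + x

def conv_mixer(dim=256, depth=8, kernel_size=9, patch_size=7, n_classes=10):
    """Patch embedding, then `depth` blocks of depthwise (spatial) mixing
    with a residual connection, followed by pointwise (channel) mixing."""
    return nn.Sequential(
        nn.Conv2d(3, dim, patch_size, stride=patch_size), nn.GELU(), nn.BatchNorm2d(dim),
        *[nn.Sequential(
            Residual(nn.Sequential(
                nn.Conv2d(dim, dim, kernel_size, groups=dim, padding="same"),
                nn.GELU(), nn.BatchNorm2d(dim))),
            nn.Conv2d(dim, dim, 1), nn.GELU(), nn.BatchNorm2d(dim))
          for _ in range(depth)],
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(dim, n_classes))
```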
no code implementations • NeurIPS 2021 • Samuel Sokota, Caleb Ho, Zaheen Ahmad, J. Zico Kolter
In this work, we present a method, called abstraction refining, for extending MCTS to stochastic environments which, unlike progressive widening, leverages the geometry of the state space.
no code implementations • NeurIPS 2021 • Leslie Rice, Anna Bair, Huan Zhang, J. Zico Kolter
Several recent works in machine learning have focused on evaluating the test-time robustness of a classifier: how well the classifier performs not just on the target domain it was trained upon, but upon perturbed examples.
no code implementations • NeurIPS 2021 • Zhichun Huang, Shaojie Bai, J. Zico Kolter
Recent research in deep learning has investigated two very different forms of "implicitness": implicit representations model high-frequency data such as images or 3D shapes directly via a low-dimensional neural network (often using, e.g., sinusoidal bases or nonlinearities); implicit layers, in contrast, refer to techniques where the forward pass of a network is computed via non-linear dynamical systems, such as fixed-point or differential equation solutions, with the backward pass computed via the implicit function theorem.
1 code implementation • NeurIPS 2021 • Swaminathan Gurumurthy, Shaojie Bai, Zachary Manchester, J. Zico Kolter
Many tasks in deep learning involve optimizing over the \emph{inputs} to a network to minimize or maximize some objective; examples include optimization over latent spaces in a generative model to match a target image, or adversarially perturbing an input to worsen classifier performance.
no code implementations • NeurIPS 2021 • Priya L. Donti, Aayushya Agarwal, Neeraj Vijay Bedmutha, Larry Pileggi, J. Zico Kolter
In recent years, the ML community has seen surges of interest in both adversarially robust learning and implicit layers, but connections between these two areas have seldom been explored.
1 code implementation • 17 Jul 2021 • Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster
We contribute a theoretically grounded approach to MCGs based on maximum entropy reinforcement learning and minimum entropy coupling that we call MEME.
1 code implementation • 28 Jun 2021 • Shaojie Bai, Vladlen Koltun, J. Zico Kolter
Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer.
no code implementations • ICLR 2022 • Yiding Jiang, Vaishnavh Nagarajan, Christina Baek, J. Zico Kolter
We empirically show that the test error of deep networks can be estimated by simply training the same architecture on the same training set but with a different run of Stochastic Gradient Descent (SGD), and measuring the disagreement rate between the two networks on unlabeled test data.
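The estimator itself is a one-liner over unlabeled data; a sketch assuming two independently trained PyTorch classifiers and a loader of unlabeled batches:

```python
import torch

@torch.no_grad()
def disagreement_rate(model_a, model_b, unlabeled_loader):
    """Fraction of unlabeled inputs on which two SGD runs of the same
    architecture disagree; the paper shows this tracks test error."""
    disagree, total = 0, 0
    for x in unlabeled_loader:
        disagree += (model_a(x).argmax(1) != model_b(x).argmax(1)).sum().item()
        total += x.shape[0]
    return disagree / total
```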
1 code implementation • 16 Jun 2021 • Shaoru Chen, Eric Wong, J. Zico Kolter, Mahyar Fazlyab
Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative.
1 code implementation • 11 Jun 2021 • Runtian Zhai, Chen Dan, J. Zico Kolter, Pradeep Ravikumar
Many machine learning tasks involve subpopulation shift where the testing data distribution is a subpopulation of the training distribution.
1 code implementation • 19 May 2021 • Bingqing Chen, Priya Donti, Kyri Baker, J. Zico Kolter, Mario Berges
Specifically, we incorporate a differentiable projection layer within a neural network-based policy to enforce that all learned actions are feasible.
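As a hedged, minimal stand-in for the paper's projection layer (which handles general convex constraints), even a box-constraint projection illustrates the pattern, since gradients flow through the projection into the policy network:

```python
import torch
import torch.nn as nn

class ProjectedPolicy(nn.Module):
    """Policy whose raw outputs are projected onto box constraints
    lo <= a <= hi; clamp is differentiable almost everywhere, so the
    policy can still be trained end to end while every emitted action
    is feasible by construction."""
    def __init__(self, net: nn.Module, lo: float, hi: float):
        super().__init__()
        self.net, self.lo, self.hi = net, lo, hi
    def forward(self, obs):
        return torch.clamp(self.net(obs), self.lo, self.hi)
```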
1 code implementation • 1 May 2021 • Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton
To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data.
1 code implementation • ICLR 2021 • Priya L. Donti, David Rolnick, J. Zico Kolter
Large optimization problems with hard constraints arise in many settings, yet classical solvers are often prohibitively slow, motivating the use of deep networks as cheap "approximate solvers."
1 code implementation • ICLR 2021 • Asher Trockman, J. Zico Kolter
Recent work has highlighted several advantages of enforcing orthogonality in the weight layers of deep networks, such as maintaining the stability of activations, preserving gradient norms, and enhancing adversarial robustness by enforcing low Lipschitz constants.
5 code implementations • NeurIPS 2021 • Shiqi Wang, Huan Zhang, Kaidi Xu, Xue Lin, Suman Jana, Cho-Jui Hsieh, J. Zico Kolter
Compared to the typically tightest but very costly semidefinite programming (SDP) based incomplete verifiers, we obtain higher verified accuracy with three orders of magnitude less verification time.
1 code implementation • ICLR 2021 • Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar
We empirically demonstrate that full-batch gradient descent on neural network training objectives typically operates in a regime we call the Edge of Stability.
no code implementations • 20 Feb 2021 • Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar
In this paper, we present a detailed empirical study to characterize the heavy-tailed nature of the gradients of the PPO surrogate reward function.
1 code implementation • NeurIPS 2020 • Chun Kai Ling, Fei Fang, J. Zico Kolter
A central problem in machine learning and statistics is to model joint densities of random variables from data.
1 code implementation • NeurIPS 2020 • Chirag Pabbaraju, Po-Wei Wang, J. Zico Kolter
Probabilistic inference in pairwise Markov Random Fields (MRFs), i.e., computing the partition function or computing a MAP estimate of the variables, is a foundational problem in probabilistic graphical models.
1 code implementation • NeurIPS 2020 • Po-Wei Wang, J. Zico Kolter
Modularity maximization has been a fundamental tool for understanding the community structure of a network, but the underlying optimization problem is nonconvex and NP-hard to solve.
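The objective being maximized is the standard Newman modularity; a short reference computation (dense NumPy, illustrative only):

```python
import numpy as np

def modularity(A: np.ndarray, labels: np.ndarray) -> float:
    """Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * 1[c_i == c_j]
    for a symmetric adjacency matrix A and community labels c."""
    k = A.sum(axis=1)
    two_m = k.sum()
    same_community = labels[:, None] == labels[None, :]
    return float(((A - np.outer(k, k) / two_m) * same_community).sum() / two_m)
```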
1 code implementation • 4 Dec 2020 • Jonathan Dinu, Jeffrey Bigham, J. Zico Kolter
As machine learning and algorithmic decision making systems are increasingly being leveraged in high-stakes human-in-the-loop settings, there is a pressing need to understand the rationale of their predictions.
1 code implementation • ICLR 2021 • Priya L. Donti, Melrose Roderick, Mahyar Fazlyab, J. Zico Kolter
When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance.
1 code implementation • 18 Oct 2020 • MingJie Sun, Siddhant Agarwal, J. Zico Kolter
Under this threat model, we propose a test-time, human-in-the-loop attack method to generate multiple effective alternative triggers without access to the initial backdoor and the training data.
1 code implementation • 8 Oct 2020 • Anit Kumar Sahu, Satya Narayan Shukla, J. Zico Kolter
We study the problem of generating adversarial examples in a black-box setting, where we only have access to a zeroth order oracle, providing us with loss function evaluations.
1 code implementation • ICLR 2021 • Eric Wong, J. Zico Kolter
In this paper, we aim to bridge this gap by learning perturbation sets from data, in order to characterize real-world effects for robust training and evaluation.
1 code implementation • 13 Jul 2020 • Satya Narayan Shukla, Anit Kumar Sahu, Devin Willmott, J. Zico Kolter
We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples for deep learning models based solely on information limited to the output label (hard label) for a queried data input.
2 code implementations • ICML 2020 • Filipe de Avila Belbute-Peres, Thomas D. Economon, J. Zico Kolter
Solving large complex partial differential equations (PDEs), such as those that arise in computational fluid dynamics (CFD), is a computationally expensive process.
1 code implementation • 7 Jul 2020 • Melrose Roderick, Vaishnavh Nagarajan, J. Zico Kolter
A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure).
no code implementations • 30 Jun 2020 • Eric Wong, Tim Schneider, Joerg Schmitt, Frank R. Schmidt, J. Zico Kolter
Additionally, we show how specific intervals of fuel injection quantities can be targeted to maximize robustness for certain ranges, allowing us to train a virtual sensor for fuel injection which is provably guaranteed to have at most 10.69% relative error under noise while maintaining 3% relative error on non-adversarial data within normalized fuel injection ranges of 0.6 to 1.0.
1 code implementation • NeurIPS 2020 • Ezra Winston, J. Zico Kolter
We then develop a parameterization of the network which ensures that all operators remain monotone, which guarantees the existence of a unique equilibrium point.
4 code implementations • NeurIPS 2020 • Shaojie Bai, Vladlen Koltun, J. Zico Kolter
These simultaneously-learned multi-resolution features allow us to train a single model on a diverse set of tasks and loss functions, such as using a single MDEQ to perform both image classification and semantic segmentation.
Ranked #50 on Semantic Segmentation on Cityscapes val
no code implementations • ICLR 2020 • Po-Wei Wang, Daria Stepanova, Csaba Domokos, J. Zico Kolter
Rules over a knowledge graph (KG) capture interpretable patterns in data and can be used for KG cleaning and completion.
4 code implementations • NeurIPS 2020 • Hadi Salman, Ming-Jie Sun, Greg Yang, Ashish Kapoor, J. Zico Kolter
We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks.
4 code implementations • ICML 2020 • Leslie Rice, Eric Wong, J. Zico Kolter
Based upon this observed effect, we show that the performance gains of virtually all recent algorithmic improvements upon adversarial training can be matched by simply using early stopping.
no code implementations • ICML 2020 • Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter
Machine learning algorithms are known to be susceptible to data poisoning attacks, where an adversary manipulates the training data to degrade performance of the resulting classifier.
1 code implementation • NeurIPS 2019 • Gaurav Manek, J. Zico Kolter
Deep networks are commonly used to model dynamical systems, predicting how the state of a system will evolve over time (either autonomously or in response to control inputs).
11 code implementations • ICLR 2020 • Eric Wong, Leslie Rice, J. Zico Kolter
Furthermore we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with $\epsilon=8/255$ in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at $\epsilon=2/255$ in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds.
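The key modification the paper identifies is a random initialization before the single FGSM step; a sketch of the resulting inner loop (the step sizes follow the paper's CIFAR-10 setting, other names are illustrative):

```python
import torch
import torch.nn.functional as F

def fgsm_random_init(model, x, y, eps=8/255, alpha=10/255):
    """FGSM adversarial example with a random start inside the L-inf
    ball -- the cheap inner step behind 'fast' adversarial training."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    loss.backward()
    delta = (delta.detach() + alpha * delta.grad.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1)
```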
4 code implementations • 2 Dec 2019 • Rizal Fathony, J. Zico Kolter
We propose a method that enables practitioners to conveniently incorporate custom non-decomposable performance metrics into differentiable learning pipelines, notably those based upon neural network architectures.
no code implementations • 15 Nov 2019 • Joshua Williams, J. Zico Kolter
Recent studies on fairness in automated decision making systems have both investigated the potential future impact of these decisions on the population at large, and emphasized that imposing "typical" fairness constraints such as demographic parity or equality of opportunity does not guarantee a benefit to disadvantaged groups.
no code implementations • NeurIPS 2019 • Juncheng B. Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze
In this work, we target our attack on the wake-word detection system, jamming the model with some inconspicuous background music to deactivate the VAs while our audio adversary is present.
1 code implementation • 30 Sep 2019 • Satya Narayan Shukla, Anit Kumar Sahu, Devin Willmott, J. Zico Kolter
We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples using information limited to loss function evaluations of input-output pairs.
no code implementations • 25 Sep 2019 • Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter
This paper considers label-flipping attacks, a type of data poisoning attack where an adversary relabels a small number of examples in a training set in order to degrade the performance of the resulting classifier.
1 code implementation • 9 Sep 2019 • Pratyush Maini, Eric Wong, J. Zico Kolter
Owing to the susceptibility of deep learning systems to adversarial attacks, there has been a great deal of work in developing (both empirically and certifiably) robust classifiers.
11 code implementations • NeurIPS 2019 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun
We present a new approach to modeling sequential data: the deep equilibrium model (DEQ).
Ranked #29 on Language Modelling on Penn Treebank (Word Level)
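At its core, a DEQ replaces stacked layers with the fixed point z* = f(z*, x) of a single layer, found by a root solver; a minimal forward-pass sketch, with simple fixed-point iteration standing in for the paper's quasi-Newton solver:

```python
import torch

def deq_forward(f, x, z0, max_iters=50, tol=1e-4):
    """Iterate z <- f(z, x) until (approximate) convergence. In a full
    DEQ the backward pass differentiates through the fixed point via the
    implicit function theorem rather than through these iterations."""
    z = z0
    for _ in range(max_iters):
        z_next = f(z, x)
        if (z_next - z).norm() < tol * (z.norm() + 1e-8):
            return z_next
        z = z_next
    return z
```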
1 code implementation • 20 Jun 2019 • Brandon Amos, Vladlen Koltun, J. Zico Kolter
We propose the Limited Multi-Label (LML) projection layer as a new primitive operation for end-to-end learning systems.
no code implementations • 14 Jun 2019 • Joseph Szurley, J. Zico Kolter
Recent work has shown the possibility of adversarial attacks on automatic speech recognition (ASR) systems.
4 code implementations • ACL 2019 • Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, Ruslan Salakhutdinov
Human language is often multimodal, comprising a mixture of natural language, facial gestures, and acoustic behaviors.
Ranked #6 on Multimodal Sentiment Analysis on MOSI
no code implementations • ICLR 2019 • Vaishnavh Nagarajan, J. Zico Kolter
The ability of overparameterized deep networks to generalize well has been linked to the fact that stochastic gradient descent (SGD) finds solutions that lie in flat, wide minima in the training loss -- minima where the output of the network is resilient to small random noise added to its parameters.
1 code implementation • 21 Mar 2019 • Juncheng Li, Frank R. Schmidt, J. Zico Kolter
In this work, we consider an alternative question: is it possible to fool deep classifiers, over all perceived objects of a certain type, by physically manipulating the camera itself?
no code implementations • 11 Mar 2019 • Chun Kai Ling, Fei Fang, J. Zico Kolter
With the recent advances in solving large, zero-sum extensive form games, there is a growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions.
2 code implementations • 21 Feb 2019 • Eric Wong, Frank R. Schmidt, J. Zico Kolter
In this paper, we propose a new threat model for adversarial attacks based on the Wasserstein distance.
1 code implementation • NeurIPS 2019 • Vaishnavh Nagarajan, J. Zico Kolter
Aimed at explaining the surprisingly good generalization behavior of overparameterized deep networks, recent works have developed a variety of generalization bounds for deep learning, all based on the fundamental learning-theoretic technique of uniform convergence.
11 code implementations • 8 Feb 2019 • Jeremy M Cohen, Elan Rosenfeld, J. Zico Kolter
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm.
Ranked #3 on Robust classification on CIFAR-10
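Prediction under the smoothed classifier is a majority vote over Gaussian-perturbed copies of the input; a simplified sketch (the paper's certification procedure additionally uses a binomial confidence bound on the vote):

```python
import torch

@torch.no_grad()
def smoothed_predict(base_classifier, x, sigma=0.25, n=1000,
                     num_classes=10, batch=100):
    """Majority vote of the base classifier under Gaussian noise on a
    single image x of shape (C, H, W); in Cohen et al. the certified L2
    radius grows with sigma and the margin between the top two classes."""
    counts = torch.zeros(num_classes)
    for _ in range(n // batch):
        noise = sigma * torch.randn(batch, *x.shape)
        preds = base_classifier(x + noise).argmax(1)
        counts += torch.bincount(preds, minlength=num_classes).float()
    return counts.argmax().item()
```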
no code implementations • 7 Jan 2019 • Vaishnavh Nagarajan, J. Zico Kolter
Why does training deep neural networks using stochastic gradient descent (SGD) result in a generalization error that does not worsen with the number of parameters in the network?
1 code implementation • 15 Dec 2018 • Po-Wei Wang, J. Zico Kolter
This paper proposes a new algorithm for solving MAX2SAT problems based on combining search methods with semidefinite programming approaches.
1 code implementation • NeurIPS 2018 • Filipe de Avila Belbute-Peres, Kevin Smith, Kelsey Allen, Josh Tenenbaum, J. Zico Kolter
We present a differentiable physics engine that can be integrated as a module in deep neural networks for end-to-end learning.
2 code implementations • NeurIPS 2018 • Brandon Amos, Ivan Dario Jimenez Rodriguez, Jacob Sacks, Byron Boots, J. Zico Kolter
We present foundations for using Model Predictive Control (MPC) as a differentiable policy class for reinforcement learning in continuous state and action spaces.
no code implementations • 23 Oct 2018 • Alnur Ali, J. Zico Kolter, Ryan J. Tibshirani
Our primary focus is to compare the risk of gradient flow to that of ridge regression.
1 code implementation • ICLR 2019 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun
On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices.
4 code implementations • NeurIPS 2018 • Eric Wong, Frank R. Schmidt, Jan Hendrik Metzen, J. Zico Kolter
Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks.
1 code implementation • 7 May 2018 • Chun Kai Ling, Fei Fang, J. Zico Kolter
Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents.
33 code implementations • 4 Mar 2018 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun
Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory.
Ranked #4 on Music Modeling on Nottingham
no code implementations • ICLR 2018 • Po-Wei Wang, J. Zico Kolter, Vijai Mohan, Inderjit S. Dhillon
Search engine users today depend heavily on query completion and correction to shape their queries.
no code implementations • ICLR 2018 • Shaojie Bai, J. Zico Kolter, Vladlen Koltun
This paper revisits the problem of sequence modeling using convolutional architectures.
Ranked #84 on Language Modelling on WikiText-103
8 code implementations • ICML 2018 • Eric Wong, J. Zico Kolter
We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data.
1 code implementation • NeurIPS 2017 • Vaishnavh Nagarajan, J. Zico Kolter
Despite the growing prominence of generative adversarial networks (GANs), optimization in GANs is still a poorly understood topic.
1 code implementation • 1 Jun 2017 • Po-Wei Wang, Wei-Cheng Chang, J. Zico Kolter
In this paper, we propose a low-rank coordinate descent approach to structured semidefinite programming with diagonal constraints.
1 code implementation • NeurIPS 2017 • Priya L. Donti, Brandon Amos, J. Zico Kolter
With the increasing popularity of machine learning techniques, it has become common to see prediction algorithms operating within some larger process.
7 code implementations • ICML 2017 • Brandon Amos, J. Zico Kolter
This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks.
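For the equality-constrained special case, the QP layer reduces to a single differentiable linear solve of the KKT system, which already conveys the idea (OptNet itself also handles inequality constraints and batched problems):

```python
import torch

def eq_qp_layer(Q, p, A, b):
    """Solve min_z 0.5*z'Qz + p'z  s.t.  Az = b via its KKT system
    [Q A'; A 0][z; lam] = [-p; b]. torch.linalg.solve is differentiable,
    so gradients flow back to all problem parameters Q, p, A, b."""
    n, m = Q.shape[0], A.shape[0]
    K = torch.cat([torch.cat([Q, A.t()], dim=1),
                   torch.cat([A, torch.zeros(m, m, dtype=Q.dtype)], dim=1)], dim=0)
    rhs = torch.cat([-p, b])
    sol = torch.linalg.solve(K, rhs)
    return sol[:n]  # optimal z*; sol[n:] holds the dual variables
```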
3 code implementations • ICML 2017 • Brandon Amos, Lei Xu, J. Zico Kolter
We show that many existing neural network architectures can be made input-convex with a minor modification, and develop specialized optimization algorithms tailored to this setting.
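The "minor modification" keeps the hidden-to-hidden weights nonnegative and adds direct passthrough connections from the input, so each layer is a convex, nondecreasing function of a convex function; a minimal two-layer sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """f(x) convex in x: z-to-z weights are clamped nonnegative and the
    activations are convex and nondecreasing (ReLU), while the direct
    x-to-z paths may be arbitrary affine maps."""
    def __init__(self, d_in, d_hidden):
        super().__init__()
        self.Wx0 = nn.Linear(d_in, d_hidden)
        self.Wz1 = nn.Linear(d_hidden, d_hidden, bias=False)  # kept >= 0
        self.Wx1 = nn.Linear(d_in, d_hidden)
        self.Wz2 = nn.Linear(d_hidden, 1, bias=False)          # kept >= 0
        self.Wx2 = nn.Linear(d_in, 1)
    def forward(self, x):
        z = F.relu(self.Wx0(x))
        z = F.relu(F.linear(z, self.Wz1.weight.clamp(min=0)) + self.Wx1(x))
        return F.linear(z, self.Wz2.weight.clamp(min=0)) + self.Wx2(x)
```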
no code implementations • 16 Nov 2015 • Matt Wytock, J. Zico Kolter
We present a convex approach to probabilistic segmentation and modeling of time series data.
no code implementations • 18 Dec 2013 • Matt Wytock, J. Zico Kolter
We propose a new framework for single-channel source separation that lies between the fully supervised and unsupervised setting.