no code implementations • 1 Sep 2023 • Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping-Yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, Tom Goldstein
We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
no code implementations • 29 Jun 2023 • Khalid Saifullah, Yuxin Wen, Jonas Geiping, Micah Goldblum, Tom Goldstein
Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks.
1 code implementation • 23 Jun 2023 • Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative.
1 code implementation • 7 Jun 2023 • John Kirchenbauer, Jonas Geiping, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando, Aniruddha Saha, Micah Goldblum, Tom Goldstein
We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document, and we compare the robustness of watermarking to other kinds of detectors.
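For intuition, a minimal sketch of the kind of statistical test such detectors use, assuming the standard green-list construction in which a fraction gamma of the vocabulary is favored at each generation step; the sliding-window scan for short embedded spans is our illustrative addition:

```python
import math

def watermark_z_score(green_count: int, total_tokens: int, gamma: float = 0.25) -> float:
    """One-proportion z-test: how far the observed count of green-list
    tokens deviates from the gamma fraction expected without a watermark."""
    expected = gamma * total_tokens
    variance = total_tokens * gamma * (1 - gamma)
    return (green_count - expected) / math.sqrt(variance)

def max_window_z(green_flags, window: int = 50, gamma: float = 0.25) -> float:
    """Scan a long document with a sliding window so that a short
    watermarked span still produces a large z-score somewhere."""
    best = float("-inf")
    for start in range(len(green_flags) - window + 1):
        z = watermark_z_score(sum(green_flags[start:start + window]), window, gamma)
        best = max(best, z)
    return best
```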
1 code implementation • 31 May 2023 • Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
While it is widely believed that duplicated images in the training set are responsible for content replication at inference time, we observe that the text conditioning of the model plays a similarly important role.
no code implementations • 30 May 2023 • Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein
It is widely believed that neural networks trained on unlearnable datasets learn only shortcuts: simpler rules that are not useful for generalization.
2 code implementations • 4 May 2023 • Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Ganesh Ramakrishnan, Micah Goldblum, Colin White
Tabular data is one of the most commonly used types of data in machine learning.
1 code implementation • 24 Apr 2023 • Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, Micah Goldblum
Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning.
1 code implementation • 11 Apr 2023 • Micah Goldblum, Marc Finzi, Keefer Rowan, Andrew Gordon Wilson
No free lunch theorems for supervised learning state that no learner can solve all problems or that all learners achieve exactly the same accuracy on average over a uniform distribution on learning problems.
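As a one-line formalization (a sketch in our own notation, for binary labels and Wolpert-style off-training-set accuracy):

```latex
% Averaged uniformly over all labelings f of a finite input space,
% every learner A attains the same expected off-training-set accuracy:
\frac{1}{|\mathcal{F}|} \sum_{f \in \mathcal{F}} \operatorname{acc}_{\mathrm{OTS}}(A, f) \;=\; \frac{1}{2}.
```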
1 code implementation • 14 Feb 2023 • Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein
Typical diffusion models are trained to accept a particular form of conditioning, most commonly text, and cannot be conditioned on other modalities without retraining.
1 code implementation • 7 Feb 2023 • Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein
In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge of how to prompt the model.
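A minimal sketch of the projection idea behind this style of hard-prompt optimization, assuming a frozen embedding table and a user-supplied differentiable loss; all names and hyperparameters here are illustrative, not the paper's code:

```python
import torch

def nearest_tokens(soft_emb, vocab_emb):
    """Index of the nearest vocabulary embedding for each soft token."""
    return torch.cdist(soft_emb, vocab_emb).argmin(dim=-1)

def optimize_hard_prompt(loss_fn, vocab_emb, n_tokens=8, steps=500, lr=0.1):
    """Keep a continuous prompt, run the forward pass on its projected
    (hard) version, and update the continuous copy with that gradient."""
    soft = vocab_emb[torch.randint(len(vocab_emb), (n_tokens,))].clone().requires_grad_(True)
    opt = torch.optim.Adam([soft], lr=lr)
    for _ in range(steps):
        hard = vocab_emb[nearest_tokens(soft, vocab_emb)]
        prompt = soft + (hard - soft).detach()   # straight-through projection
        loss = loss_fn(prompt)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return nearest_tokens(soft.detach(), vocab_emb)
```

The returned token ids form a discrete prompt that can be fed back to the model or shared as plain text.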
1 code implementation • 6 Feb 2023 • Yuancheng Xu, Yanchao Sun, Micah Goldblum, Tom Goldstein, Furong Huang
However, it is unclear whether existing robust training methods effectively increase the margin for each vulnerable point during training.
1 code implementation • 13 Dec 2022 • Amin Ghiasi, Hamid Kazemi, Eitan Borgnia, Steven Reich, Manli Shu, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein
In addition, we show that ViTs maintain spatial information in all layers except the final layer.
no code implementations • CVPR 2023 • Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes.
no code implementations • 28 Nov 2022 • Wanqian Yang, Polina Kirichenko, Micah Goldblum, Andrew Gordon Wilson
Deep neural networks are susceptible to shortcut learning, using simple features to achieve low training loss without discovering essential semantic structure.
1 code implementation • 24 Nov 2022 • Sanae Lotfi, Marc Finzi, Sanyam Kapoor, Andres Potapczynski, Micah Goldblum, Andrew Gordon Wilson
While there has been progress in developing non-vacuous generalization bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works.
no code implementations • 23 Oct 2022 • Renkun Ni, Ping-Yeh Chiang, Jonas Geiping, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein
Sharpness-Aware Minimization (SAM) has recently emerged as a robust technique for improving the accuracy of deep neural networks.
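For reference, a minimal sketch of one SAM update, assuming every parameter receives a gradient; the two-pass structure and rho follow the published algorithm, while the helper itself is ours:

```python
import torch

def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    """One Sharpness-Aware Minimization step: climb to the (approximate)
    worst nearby weights, take the gradient there, then update the
    original weights with it."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    eps = [rho * g / (norm + 1e-12) for g in grads]
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.add_(e)                      # ascend to the sharp point
    model.zero_grad()
    loss_fn(model(x), y).backward()        # gradient at the perturbed weights
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)                      # restore the original weights
    base_opt.step()                        # apply the SAM gradient
    model.zero_grad()
```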
1 code implementation • 19 Oct 2022 • Yuxin Wen, Arpit Bansal, Hamid Kazemi, Eitan Borgnia, Micah Goldblum, Jonas Geiping, Tom Goldstein
As industrial applications are increasingly automated by machine learning models, enforcing personal data ownership and intellectual property rights requires tracing training data back to their rightful owners.
1 code implementation • 18 Oct 2022 • Rhea Sukthanker, Samuel Dooley, John P. Dickerson, Colin White, Frank Hutter, Micah Goldblum
Motivated by our findings, we run the first neural architecture search for fairness, jointly with a search for hyperparameters.
1 code implementation • 17 Oct 2022 • Yuxin Wen, Jonas Geiping, Liam Fowl, Hossein Souri, Rama Chellappa, Micah Goldblum, Tom Goldstein
Federated learning is particularly susceptible to model poisoning and backdoor attacks because individual users have direct control over the training data and model updates.
1 code implementation • 12 Oct 2022 • Jonas Geiping, Micah Goldblum, Gowthami Somepalli, Ravid Shwartz-Ziv, Tom Goldstein, Andrew Gordon Wilson
Despite the clear performance benefits of data augmentations, little is known about why they are so effective.
1 code implementation • 6 Oct 2022 • Nate Gruver, Marc Finzi, Micah Goldblum, Andrew Gordon Wilson
In order to better understand the role of equivariance in recent vision models, we introduce the Lie derivative, a method for measuring equivariance with strong mathematical foundations and minimal hyperparameters.
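A finite-difference sketch of the idea for the special case of translation invariance of a classifier; the paper's operator handles general equivariance, and `shift` and `eps` here are illustrative:

```python
import torch
import torch.nn.functional as F

def shift(img, t):
    """Sub-pixel horizontal shift of an NCHW float tensor via grid_sample."""
    n, c, h, w = img.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h, device=img.device),
                            torch.linspace(-1, 1, w, device=img.device), indexing="ij")
    grid = torch.stack([xs + t, ys], dim=-1).unsqueeze(0).expand(n, h, w, 2)
    return F.grid_sample(img, grid, align_corners=True)

def translation_lie_derivative(model, x, eps=1e-3):
    """Finite-difference Lie derivative for horizontal translation: the rate
    at which the output changes under an infinitesimal shift of the input.
    It vanishes for a translation-invariant model."""
    return (model(shift(x, eps)) - model(x)) / eps
```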
3 code implementations • 19 Aug 2022 • Arpit Bansal, Eitan Borgnia, Hong-Min Chu, Jie S. Li, Hamid Kazemi, Furong Huang, Micah Goldblum, Jonas Geiping, Tom Goldstein
We observe that the generative behavior of diffusion models is not strongly dependent on the choice of image degradation, and in fact an entire family of generative models can be constructed by varying this choice.
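A sketch of the sampling loop this observation suggests, where `degrade(x0, t)` applies t steps of an arbitrary corruption (blur, masking, noise, ...) and `restore(x_t, t)` is the learned restoration network; both names are placeholders for the paper's operators:

```python
def cold_diffusion_sample(restore, degrade, x_T, T):
    """Generation with an arbitrary degradation: at each step, predict the
    clean image, then trade one level of degradation for the next-lower one."""
    x = x_T
    for t in range(T, 0, -1):
        x0_hat = restore(x, t)
        x = x - degrade(x0_hat, t) + degrade(x0_hat, t - 1)
    return x
```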
1 code implementation • 30 Jun 2022 • Roman Levin, Valeriia Cherepanova, Avi Schwarzschild, Arpit Bansal, C. Bayan Bruss, Tom Goldstein, Andrew Gordon Wilson, Micah Goldblum
In this work, we demonstrate that upstream data gives tabular neural networks a decisive advantage over widely used GBDT models.
1 code implementation • 8 Jun 2022 • Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein, David W. Jacobs
Unfortunately, existing methods require knowledge of both the target architecture and the complete dataset so that a surrogate network can be trained, the parameters of which are used to generate the attack.
1 code implementation • 20 May 2022 • Ravid Shwartz-Ziv, Micah Goldblum, Hossein Souri, Sanyam Kapoor, Chen Zhu, Yann LeCun, Andrew Gordon Wilson
Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task.
no code implementations • 19 Apr 2022 • Pedro Sandoval-Segura, Vasu Singla, Liam Fowl, Jonas Geiping, Micah Goldblum, David Jacobs, Tom Goldstein
We advocate for evaluating poisons in terms of peak test accuracy.
no code implementations • 15 Mar 2022 • Valeriia Cherepanova, Steven Reich, Samuel Dooley, Hossein Souri, Micah Goldblum, Tom Goldstein
This is an unfortunate omission, as 'imbalance' is a more complex matter in identification; imbalance may arise not only in the training data but also in the testing data, and furthermore may affect the proportion of identities belonging to each demographic group or the number of images belonging to each identity.
1 code implementation • CVPR 2022 • Gowthami Somepalli, Liam Fowl, Arpit Bansal, Ping-Yeh Chiang, Yehuda Dar, Richard Baraniuk, Micah Goldblum, Tom Goldstein
We also use decision boundary methods to visualize double descent phenomena.
1 code implementation • 23 Feb 2022 • Sanae Lotfi, Pavel Izmailov, Gregory Benton, Micah Goldblum, Andrew Gordon Wilson
We provide a partial remedy through a conditional marginal likelihood, which we show is more aligned with generalization, and practically valuable for large-scale hyperparameter learning, such as in deep kernel learning.
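Concretely, for training data $\mathcal{D}_{1:n}$ and model $\mathcal{M}$, the conditional marginal likelihood scores only the data after the first $m$ points, which by the chain rule is a difference of two ordinary log marginal likelihoods (notation ours):

```latex
\log p(\mathcal{D}_{m+1:n} \mid \mathcal{D}_{1:m}, \mathcal{M})
  \;=\; \log p(\mathcal{D}_{1:n} \mid \mathcal{M})
  \;-\; \log p(\mathcal{D}_{1:m} \mid \mathcal{M}).
```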
1 code implementation • 11 Feb 2022 • Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein
Algorithmic extrapolation can be achieved through recurrent systems, which can be iterated many times to solve difficult reasoning problems.
1 code implementation • 1 Feb 2022 • Yuxin Wen, Jonas Geiping, Liam Fowl, Micah Goldblum, Tom Goldstein
Federated learning (FL) has rapidly risen in popularity due to its promise of privacy and efficiency.
1 code implementation • 31 Jan 2022 • Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein
Existing techniques for model inversion typically rely on hard-to-tune regularizers, such as total variation or feature regularization, which must be individually calibrated for each network in order to produce adequate images.
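Of the regularizers mentioned, total variation is the most common; a minimal version for an NCHW image tensor looks like this (a sketch, not the paper's code):

```python
def total_variation(img):
    """Penalize differences between neighboring pixels so that inverted
    images come out smooth rather than noisy (img is an NCHW tensor)."""
    dh = (img[..., 1:, :] - img[..., :-1, :]).abs().mean()
    dw = (img[..., :, 1:] - img[..., :, :-1]).abs().mean()
    return dh + dw
```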
1 code implementation • 29 Jan 2022 • Liam Fowl, Jonas Geiping, Steven Reich, Yuxin Wen, Wojtek Czaja, Micah Goldblum, Tom Goldstein
A central tenet of federated learning (FL), which trains models without centralizing user data, is privacy.
1 code implementation • 25 Nov 2021 • Zeyad Ali Sami Emam, Hong-Min Chu, Ping-Yeh Chiang, Wojciech Czaja, Richard Leapman, Micah Goldblum, Tom Goldstein
Active learning (AL) algorithms aim to identify an optimal subset of data for annotation, such that deep neural networks (DNNs) can achieve better performance when trained on this labeled subset.
2 code implementations • ICLR 2022 • Liam Fowl, Jonas Geiping, Wojtek Czaja, Micah Goldblum, Tom Goldstein
Federated learning has quickly gained popularity with its promises of increased user privacy and efficiency.
no code implementations • 15 Oct 2021 • Samuel Dooley, Ryan Downing, George Wei, Nathan Shankar, Bradon Thymes, Gudrun Thorkelsdottir, Tiye Kurtz-Miott, Rachel Mattson, Olufemi Obiwumi, Valeriia Cherepanova, Micah Goldblum, John P Dickerson, Tom Goldstein
Much recent research has uncovered and discussed serious concerns of bias in facial analysis technologies, finding performance disparities between groups of people based on perceived gender, skin type, lighting condition, etc.
no code implementations • 13 Oct 2021 • Hossein Souri, Pirazh Khorramshahi, Chun Pong Lau, Micah Goldblum, Rama Chellappa
The adversarial attack literature contains a myriad of algorithms for crafting perturbations which yield pathological behavior in neural networks.
1 code implementation • ICLR 2022 • Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
It is widely believed that the implicit regularization of SGD is fundamental to the impressive generalization behavior we observe in neural networks.
no code implementations • 29 Sep 2021 • Keshav Ganapathy, Emily Liu, Zain Zarger, Gowthami Somepalli, Micah Goldblum, Tom Goldstein
As machine learning conferences grow rapidly, many are concerned that individuals will be left behind on the basis of traits such as gender and geography.
no code implementations • 29 Sep 2021 • Liam H Fowl, Ping-Yeh Chiang, Micah Goldblum, Jonas Geiping, Arpit Amit Bansal, Wojciech Czaja, Tom Goldstein
These two behaviors can be in conflict, as an organization wants to prevent competitors from using its own data to replicate the performance of its proprietary models.
no code implementations • 29 Sep 2021 • Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam H Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, Tom Goldstein
Data poisoning and backdoor attacks manipulate training data to induce security breaches in a victim model.
no code implementations • 29 Sep 2021 • Liam H Fowl, Micah Goldblum, Arjun Gupta, Amr Sharaf, Tom Goldstein
We validate and deploy this metric on both images and text.
no code implementations • 29 Sep 2021 • Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein
Classical machine learning systems perform best when they are trained and tested on the same distribution, and they lack a mechanism to increase model power after training is complete.
no code implementations • 29 Sep 2021 • Mucong Ding, Kezhi Kong, Jiuhai Chen, John Kirchenbauer, Micah Goldblum, David Wipf, Furong Huang, Tom Goldstein
We observe that in most cases, we need both a suitable domain generalization algorithm and a strong GNN backbone model to optimize out-of-distribution test performance.
no code implementations • ICLR 2022 • Renkun Ni, Manli Shu, Hossein Souri, Micah Goldblum, Tom Goldstein
Contrastive learning has recently taken off as a paradigm for learning from unlabeled data.
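The workhorse objective in this paradigm is InfoNCE; a minimal single-direction version, assuming `z1` and `z2` are embeddings of two augmented views of the same batch:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """Single-direction InfoNCE: matched rows of z1 and z2 (two augmented
    views of the same images) are positives, all other pairs negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                      # cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```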
1 code implementation • 9 Sep 2021 • Zhipeng Wei, Jingjing Chen, Micah Goldblum, Zuxuan Wu, Tom Goldstein, Yu-Gang Jiang
We evaluate the transferability of attacks on state-of-the-art ViTs, CNNs and robustly trained CNNs.
1 code implementation • 13 Aug 2021 • Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Arpit Bansal, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein
We describe new datasets for studying generalization from easy to hard examples.
1 code implementation • 3 Aug 2021 • Roman Levin, Manli Shu, Eitan Borgnia, Furong Huang, Micah Goldblum, Tom Goldstein
We find that samples which cause similar parameters to malfunction are semantically similar.
1 code implementation • NeurIPS 2021 • Liam Fowl, Micah Goldblum, Ping-Yeh Chiang, Jonas Geiping, Wojtek Czaja, Tom Goldstein
The adversarial machine learning literature is largely partitioned into evasion attacks on testing data and poisoning attacks on training data.
no code implementations • 17 Jun 2021 • Arpit Bansal, Micah Goldblum, Valeriia Cherepanova, Avi Schwarzschild, C. Bayan Bruss, Tom Goldstein
Class-imbalanced data, in which some classes contain far more samples than others, is ubiquitous in real-world applications.
1 code implementation • 16 Jun 2021 • Hossein Souri, Liam Fowl, Rama Chellappa, Micah Goldblum, Tom Goldstein
In contrast, the Hidden Trigger Backdoor Attack achieves poisoning without placing a trigger into the training data at all.
1 code implementation • NeurIPS 2021 • Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Furong Huang, Uzi Vishkin, Micah Goldblum, Tom Goldstein
In this work, we show that recurrent networks trained to solve simple problems with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
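A sketch of the weight-tied architecture this describes: one block reused `iters` times, where `iters` can be raised at test time; the channel counts and two-channel output head are illustrative:

```python
import torch.nn as nn

class RecurrentSolver(nn.Module):
    """One weight-tied block applied `iters` times; training uses a small
    iteration count, and harder test instances get more iterations."""
    def __init__(self, channels=64):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.decode = nn.Conv2d(channels, 2, 3, padding=1)

    def forward(self, x, iters=20):
        h = self.encode(x)
        for _ in range(iters):      # extra recurrences = extra "thinking"
            h = self.block(h)
        return self.decode(h)
```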
7 code implementations • 2 Jun 2021 • Gowthami Somepalli, Micah Goldblum, Avi Schwarzschild, C. Bayan Bruss, Tom Goldstein
We devise a hybrid deep learning approach to solving tabular data problems.
1 code implementation • ICLR 2021 • Phillip Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom Goldstein
We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images.
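For concreteness, a sketch of the Levina-Bickel MLE estimator used in this line of work, computed from ratios of k-nearest-neighbor distances; the simple averaging at the end is one of several aggregation variants:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def mle_intrinsic_dimension(X, k=20):
    """Levina-Bickel MLE of intrinsic dimension from ratios of
    k-nearest-neighbor distances (X is an (n_points, n_features) array)."""
    dists, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    dists = dists[:, 1:]                         # drop the self-distance
    # m_hat(x) = [ (1/(k-1)) * sum_{j<k} log(T_k(x) / T_j(x)) ]^{-1}
    logs = np.log(dists[:, -1][:, None] / dists[:, :-1])
    m_hat = (k - 1) / logs.sum(axis=1)
    return m_hat.mean()                          # aggregate over points
```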
1 code implementation • 2 Mar 2021 • Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, Tom Goldstein
The InstaHide method has recently been proposed as an alternative to DP training that leverages supposed privacy properties of the mixup augmentation, although without rigorous guarantees.
1 code implementation • 26 Feb 2021 • Jonas Geiping, Liam Fowl, Gowthami Somepalli, Micah Goldblum, Michael Moeller, Tom Goldstein
Data poisoning is a threat model in which a malicious actor tampers with training data to manipulate outcomes at inference time.
1 code implementation • ICLR 2022 • Avi Schwarzschild, Arjun Gupta, Amin Ghiasi, Micah Goldblum, Tom Goldstein
It is widely believed that deep neural networks contain layer specialization, wherein neural networks extract hierarchical features representing edges and patterns in shallow layers and complete objects in deeper layers.
no code implementations • 16 Feb 2021 • Liam Fowl, Ping-Yeh Chiang, Micah Goldblum, Jonas Geiping, Arpit Bansal, Wojtek Czaja, Tom Goldstein
Large organizations such as social media companies continually release data, for example user images.
no code implementations • 12 Feb 2021 • Valeriia Cherepanova, Vedant Nanda, Micah Goldblum, John P. Dickerson, Tom Goldstein
As machine learning algorithms have been widely deployed across applications, many concerns have been raised over the fairness of their predictions, especially in high stakes settings (such as facial recognition and medical imaging).
no code implementations • ICLR 2021 • Valeriia Cherepanova, Micah Goldblum, Harrison Foley, Shiyuan Duan, John Dickerson, Gavin Taylor, Tom Goldstein
Facial recognition systems are increasingly deployed by private corporations, government agencies, and contractors for consumer services and mass surveillance programs alike.
no code implementations • 1 Jan 2021 • Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P Dickerson, Tom Goldstein
Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference.
no code implementations • 18 Dec 2020 • Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein
As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance.
no code implementations • 24 Nov 2020 • David Tran, Alex Valtchanov, Keshav Ganapathy, Raymond Feng, Eric Slud, Micah Goldblum, Tom Goldstein
Members of the machine learning community are likely to overhear allegations ranging from randomness of acceptance decisions to institutional bias.
1 code implementation • 18 Nov 2020 • Eitan Borgnia, Valeriia Cherepanova, Liam Fowl, Amin Ghiasi, Jonas Geiping, Micah Goldblum, Tom Goldstein, Arjun Gupta
Data poisoning and backdoor attacks manipulate victim models by maliciously modifying training data.
1 code implementation • 14 Oct 2020 • Renkun Ni, Micah Goldblum, Amr Sharaf, Kezhi Kong, Tom Goldstein
Conventional image classifiers are trained by randomly sampling mini-batches of images.
no code implementations • 13 Oct 2020 • Liam Fowl, Micah Goldblum, Arjun Gupta, Amr Sharaf, Tom Goldstein
We validate and deploy this metric on both images and text.
1 code implementation • 11 Oct 2020 • David Tran, Alex Valtchanov, Keshav Ganapathy, Raymond Feng, Eric Slud, Micah Goldblum, Tom Goldstein
Members of the machine learning community are likely to overhear allegations ranging from randomness of acceptance decisions to institutional bias.
no code implementations • 28 Sep 2020 • Manli Shu, Zuxuan Wu, Micah Goldblum, Tom Goldstein
Adversarial training is the industry standard for producing models that are robust to small adversarial perturbations.
1 code implementation • NeurIPS 2021 • Manli Shu, Zuxuan Wu, Micah Goldblum, Tom Goldstein
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce models that are robust to various unseen distributional shifts.
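A sketch of the core operation, assuming NCHW feature maps and per-channel perturbations `delta_mu`, `delta_sigma` that an inner loop would optimize to maximize the task loss; the names are ours:

```python
import torch

def perturb_feature_stats(feats, delta_mu, delta_sigma):
    """Shift a feature map's per-channel mean and std instead of the input
    pixels; feats is (N, C, H, W), and the deltas broadcast as (N, C, 1, 1)."""
    mu = feats.mean(dim=(2, 3), keepdim=True)
    sigma = feats.std(dim=(2, 3), keepdim=True) + 1e-6
    normalized = (feats - mu) / sigma
    return normalized * (sigma + delta_sigma) + (mu + delta_mu)
```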
2 code implementations • 22 Jun 2020 • Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P. Dickerson, Tom Goldstein
Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference.
no code implementations • 21 Feb 2020 • Micah Goldblum, Avi Schwarzschild, Ankit B. Patel, Tom Goldstein
Algorithmic trading systems are often completely automated, and deep learning is increasingly receiving attention in this domain.
1 code implementation • ICML 2020 • Micah Goldblum, Steven Reich, Liam Fowl, Renkun Ni, Valeriia Cherepanova, Tom Goldstein
In doing so, we introduce and verify several hypotheses for why meta-learned models perform better.
no code implementations • 18 Nov 2019 • Ping-Yeh Chiang, Jonas Geiping, Micah Goldblum, Tom Goldstein, Renkun Ni, Steven Reich, Ali Shafahi
State-of-the-art adversarial attacks on neural networks use expensive iterative methods and numerous random restarts from different initial points.
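For reference, the expensive baseline being described is essentially L-infinity PGD with random restarts; a minimal sketch, with clamping to the valid pixel range omitted for brevity:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10, restarts=5):
    """Iterative L-infinity PGD run from several random starting
    perturbations, keeping the strongest adversarial example per input."""
    best = x.clone()
    best_loss = torch.full((x.size(0),), -float("inf"), device=x.device)
    for _ in range(restarts):
        delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            loss.backward()
            with torch.no_grad():
                delta += alpha * delta.grad.sign()   # ascend the loss
                delta.clamp_(-eps, eps)              # project to the eps-ball
            delta.grad.zero_()
        with torch.no_grad():
            losses = F.cross_entropy(model(x + delta), y, reduction="none")
            improved = losses > best_loss
            best[improved] = (x + delta)[improved]
            best_loss[improved] = losses[improved]
    return best
```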
1 code implementation • NeurIPS 2020 • Micah Goldblum, Liam Fowl, Tom Goldstein
Previous work on adversarially robust neural networks for image classification requires large training sets and computationally expensive training procedures.
1 code implementation • ICLR 2020 • Micah Goldblum, Jonas Geiping, Avi Schwarzschild, Michael Moeller, Tom Goldstein
We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike.
2 code implementations • NeurIPS Workshop ICBINB 2020 • W. Ronny Huang, Zeyad Emam, Micah Goldblum, Liam Fowl, Justin K. Terry, Furong Huang, Tom Goldstein
The power of neural networks lies in their ability to generalize to unseen data, yet the underlying reasons for this phenomenon remain elusive.
2 code implementations • 23 May 2019 • Micah Goldblum, Liam Fowl, Soheil Feizi, Tom Goldstein
In addition to producing small models with high test accuracy like conventional distillation, ARD also passes the superior robustness of large networks onto the student.
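A sketch of an adversarially robust distillation objective in this spirit, where the student is evaluated on adversarial inputs but matched to the frozen teacher's clean predictions; the exact placement of the cross-entropy term varies across variants, and `T` and `alpha` are the usual distillation knobs:

```python
import torch
import torch.nn.functional as F

def ard_loss(student, teacher, x, y, x_adv, T=4.0, alpha=0.9):
    """Distill onto adversarial inputs: the student's softened predictions
    on x_adv are pulled toward the frozen teacher's predictions on clean x,
    mixed with a standard cross-entropy term."""
    with torch.no_grad():
        t_probs = F.softmax(teacher(x) / T, dim=1)
    s_log_probs = F.log_softmax(student(x_adv) / T, dim=1)
    kd = F.kl_div(s_log_probs, t_probs, reduction="batchmean") * T * T
    ce = F.cross_entropy(student(x_adv), y)
    return alpha * kd + (1 - alpha) * ce

# x_adv would come from an inner PGD attack on the student (not shown).
```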