no code implementations • ICML 2020 • Karthik Abinav Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein
Through novel theoretical and experimental results, we show how the neural net architecture affects gradient confusion, and thus the efficiency of training.
no code implementations • ICLR 2019 • Pengzhi Huang, Emre Gonultas, Said Medjkouh, Oscar Castaneda, Olav Tirkkonen, Tom Goldstein, Christoph Studer
In a number of practical applications that rely on dimensionality reduction, the dataset or measurement process provides valuable side information that can be incorporated when learning low-dimensional embeddings.
1 code implementation • 14 Oct 2024 • Alex Stein, Samuel Sharpe, Doron Bergman, Senthil Kumar, C. Bayan Bruss, John Dickerson, Tom Goldstein, Micah Goldblum
Moreover, these approaches often assume specific use-cases, for example that we know the labels of all historic events or that we only predict a pre-specified label and not the data's features themselves.
no code implementations • 27 Sep 2024 • Mucong Ding, ChengHao Deng, Jocelyn Choo, Zichu Wu, Aakriti Agrawal, Avi Schwarzschild, Tianyi Zhou, Tom Goldstein, John Langford, Anima Anandkumar, Furong Huang
While generalization over tasks from easy to hard is crucial for profiling large language models (LLMs), datasets with fine-grained difficulty annotations for each problem across a broad range of complexity are still lacking.
no code implementations • 24 Jul 2024 • Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang
Surprisingly, we find that watermarking adversely affects the success rate of MIAs, complicating the task of detecting copyrighted text in the pretraining dataset.
1 code implementation • 27 Jun 2024 • Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann Lecun, Tom Goldstein, Willie Neiswanger, Micah Goldblum
In this work, we introduce a new benchmark for LLMs designed to be immune to both test set contamination and the pitfalls of LLM judging and human crowdsourcing.
1 code implementation • 14 Jun 2024 • Alex Hanson, Allen Tu, Vasu Singla, Mayuka Jayawardhana, Matthias Zwicker, Tom Goldstein
Recent advancements in novel view synthesis have enabled real-time rendering speeds and high reconstruction accuracy.
no code implementations • 14 Jun 2024 • Vasu Singla, Kaiyu Yue, Sukriti Paul, Reza Shirkavand, Mayuka Jayawardhana, Alireza Ganjdanesh, Heng Huang, Abhinav Bhatele, Gowthami Somepalli, Tom Goldstein
Training large vision-language models requires extensive, high-quality image-text pairs.
1 code implementation • 14 Jun 2024 • Abhimanyu Hans, Yuxin Wen, Neel Jain, John Kirchenbauer, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Geiping, Abhinav Bhatele, Tom Goldstein
Large language models can memorize and repeat their training data, causing privacy and copyright risks.
no code implementations • 14 Jun 2024 • Jiuhai Chen, Rifaa Qadri, Yuxin Wen, Neel Jain, John Kirchenbauer, Tianyi Zhou, Tom Goldstein
Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models.
no code implementations • 11 Jun 2024 • Lichang Chen, Jiuhai Chen, Chenxi Liu, John Kirchenbauer, Davit Soselia, Chen Zhu, Tom Goldstein, Tianyi Zhou, Heng Huang
In this paper, we propose a more efficient data exploration strategy for online preference tuning (OPTune), which does not rely on human-curated or pre-collected teacher responses but dynamically samples informative responses for on-policy preference alignment.
2 code implementations • 6 Jun 2024 • Larisa Markeeva, Sean McLeish, Borja Ibarz, Wilfried Bounsi, Olga Kozlova, Alex Vitvitskyi, Charles Blundell, Tom Goldstein, Avi Schwarzschild, Petar Veličković
Three years ago, a similar issue was identified and rectified in the field of neural algorithmic reasoning, with the advent of the CLRS benchmark.
1 code implementation • 27 May 2024 • Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits.
2 code implementations • 24 May 2024 • Xiyao Wang, Jiuhai Chen, Zhaoyang Wang, YuHang Zhou, Yiyang Zhou, Huaxiu Yao, Tianyi Zhou, Tom Goldstein, Parminder Bhatia, Furong Huang, Cao Xiao
In this paper, we propose SIMA, a framework that enhances visual and language modality alignment through self-improvement, eliminating the need for external models or data.
Ranked #153 on Visual Question Answering on MM-Vet
no code implementations • 14 May 2024 • Ruchit Rawal, Khalid Saifullah, Miquel Farré, Ronen Basri, David Jacobs, Gowthami Somepalli, Tom Goldstein
Current datasets for long-form video understanding often fall short of providing genuine long-form comprehension challenges, as many tasks derived from these datasets can be successfully tackled by analyzing just one or a few random frames from a video.
no code implementations • 10 May 2024 • John Kirchenbauer, Garrett Honke, Gowthami Somepalli, Jonas Geiping, Daphne Ippolito, Katherine Lee, Tom Goldstein, David Andre
We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation.
1 code implementation • 4 Apr 2024 • Sean McLeish, Avi Schwarzschild, Tom Goldstein
We evaluate ChatGPT's ability to solve algorithm problems from the CLRS benchmark suite that is designed for GNNs.
1 code implementation • 1 Apr 2024 • Gowthami Somepalli, Anubhav Gupta, Kamal Gupta, Shramay Palta, Micah Goldblum, Jonas Geiping, Abhinav Shrivastava, Tom Goldstein
We also propose a method to extract style descriptors that can be used to attribute the style of a generated image to the images used in the training dataset of a text-to-image model.
no code implementations • 1 Apr 2024 • Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, Nicholas Carlini
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
1 code implementation • 25 Mar 2024 • Hossein Souri, Arpit Bansal, Hamid Kazemi, Liam Fowl, Aniruddha Saha, Jonas Geiping, Andrew Gordon Wilson, Rama Chellappa, Tom Goldstein, Micah Goldblum
As a result, we may be able to craft more potent poisons by carefully choosing the base samples.
1 code implementation • 5 Mar 2024 • Hamid Kazemi, Atoosa Chegini, Jonas Geiping, Soheil Feizi, Tom Goldstein
We employ an inversion-based approach to examine CLIP models.
1 code implementation • 21 Feb 2024 • Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein
It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements.
no code implementations • 11 Feb 2024 • Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro
In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs.
1 code implementation • 5 Feb 2024 • Yuancheng Xu, Jiarui Yao, Manli Shu, Yanchao Sun, Zichu Wu, Ning Yu, Tom Goldstein, Furong Huang
Vision-Language Models (VLMs) excel in generating textual responses from visual inputs, but their versatility raises security concerns.
1 code implementation • 22 Jan 2024 • Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors.
1 code implementation • 16 Jan 2024 • Bang An, Mucong Ding, Tahseen Rabbani, Aakriti Agrawal, Yuancheng Xu, ChengHao Deng, Sicheng Zhu, Abdirisak Mohamed, Yuxin Wen, Tom Goldstein, Furong Huang
Our evaluation examines two pivotal dimensions: the degree of image quality degradation and the efficacy of watermark detection after attacks.
no code implementations • 26 Dec 2023 • Ping-Yeh Chiang, Yipin Zhou, Omid Poursaeed, Satya Narayan Shukla, Ashish Shah, Tom Goldstein, Ser-Nam Lim
Recently, Pyramid Adversarial training (Herrmann et al., 2022) has been shown to be very effective for improving clean accuracy and distribution-shift robustness of vision transformers.
no code implementations • 7 Dec 2023 • Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson
The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time.
1 code implementation • CVPR 2024 • Kaiyu Yue, Bor-Chun Chen, Jonas Geiping, Hengduo Li, Tom Goldstein, Ser-Nam Lim
We present an approach to pose object recognition as next token prediction.
1 code implementation • 3 Nov 2023 • Vasu Singla, Pedro Sandoval-Segura, Micah Goldblum, Jonas Geiping, Tom Goldstein
Our approach serves as a simple and efficient baseline for data attribution on images.
2 code implementations • NeurIPS 2023 • Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, Gowthami Somepalli, Prithvijit Chattopadhyay, Mark Ibrahim, Adrien Bardes, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, Tom Goldstein
Battle of the Backbones (BoB) makes this choice easier by benchmarking a diverse suite of pretrained models, including vision-language models, those trained via self-supervised learning, and the Stable Diffusion backbone, across a diverse set of computer vision tasks ranging from classification to object detection to OOD generalization and more.
4 code implementations • 9 Oct 2023 • Neel Jain, Ping-Yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation.
1 code implementation • 1 Sep 2023 • Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping-Yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, Tom Goldstein
We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
no code implementations • 29 Jun 2023 • Khalid Saifullah, Yuxin Wen, Jonas Geiping, Micah Goldblum, Tom Goldstein
Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks.
1 code implementation • NeurIPS 2023 • Manli Shu, Jiongxiao Wang, Chen Zhu, Jonas Geiping, Chaowei Xiao, Tom Goldstein
In this work, we investigate how an adversary can exploit instruction tuning by injecting specific instruction-following examples into the training data that intentionally change the model's behavior.
1 code implementation • 23 Jun 2023 • Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative.
1 code implementation • 7 Jun 2023 • John Kirchenbauer, Jonas Geiping, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando, Aniruddha Saha, Micah Goldblum, Tom Goldstein
We also consider a range of new detection schemes that are sensitive to short spans of watermarked text embedded inside a large document, and we compare the robustness of watermarking to other kinds of detectors.
2 code implementations • 5 Jun 2023 • Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou
Large language models (LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden.
2 code implementations • 31 May 2023 • Yuxin Wen, John Kirchenbauer, Jonas Geiping, Tom Goldstein
The watermark embeds a pattern into the initial noise vector used for sampling.
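To make the idea of placing a pattern in the initial noise concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the choice of a ring-shaped region in Fourier space, the constant `key_value`, and the helper names `ring_mask`, `embed_pattern`, and `ring_distance` are all assumptions made for illustration. A real detector would first have to recover the initial noise from a generated image, which is not shown here.

```python
import numpy as np

def ring_mask(size, r_inner, r_outer):
    """Boolean mask selecting a ring around the centre of a shifted spectrum."""
    yy, xx = np.mgrid[:size, :size]
    c = size // 2
    r = np.sqrt((yy - c) ** 2 + (xx - c) ** 2)
    return (r >= r_inner) & (r < r_outer)

def embed_pattern(noise, mask, key_value):
    """Overwrite the masked Fourier coefficients of the noise with a real constant."""
    spec = np.fft.fftshift(np.fft.fft2(noise))
    spec[mask] = key_value
    # The ring is point-symmetric and the key is real, so keeping only the
    # real part of the inverse transform leaves the masked coefficients intact.
    return np.fft.ifft2(np.fft.ifftshift(spec)).real

def ring_distance(x, mask, key_value):
    """Mean absolute gap between the masked coefficients and the key."""
    spec = np.fft.fftshift(np.fft.fft2(x))
    return float(np.mean(np.abs(spec[mask] - key_value)))

rng = np.random.default_rng(0)
noise = rng.normal(size=(64, 64))      # stand-in for an initial noise vector
mask = ring_mask(64, r_inner=4, r_outer=10)
key_value = 50.0                       # hypothetical key pattern

marked = embed_pattern(noise, mask, key_value)
print("distance, marked noise:  ", ring_distance(marked, mask, key_value))
print("distance, unmarked noise:", ring_distance(noise, mask, key_value))
```

The marked noise has a distance near zero while unmarked noise does not, which is the kind of gap a threshold-based detector could use.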
1 code implementation • NeurIPS 2023 • Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
While it is widely believed that duplicated images in the training set are responsible for content replication at inference time, we observe that the text conditioning of the model plays a similarly important role.
1 code implementation • NeurIPS 2023 • Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein
First, it is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization.
no code implementations • 24 Apr 2023 • Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann Lecun, Micah Goldblum
Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning.
no code implementations • 5 Apr 2023 • Pedro Sandoval-Segura, Jonas Geiping, Tom Goldstein
Recently developed text-to-image diffusion models make it easy to edit or create high-quality images.
1 code implementation • 28 Feb 2023 • Alex Stein, Avi Schwarzschild, Michael Curry, Tom Goldstein, John Dickerson
It has been shown that neural networks can be used to approximate optimal mechanisms while satisfying the constraints that an auction be strategyproof and individually rational.
1 code implementation • 14 Feb 2023 • Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Geiping, Tom Goldstein
Typical diffusion models are trained to accept a particular form of conditioning, most commonly text, and cannot be conditioned on other modalities without retraining.
2 code implementations • NeurIPS 2023 • Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein
In the text-to-image setting, the method creates hard prompts for diffusion models, allowing API users to easily generate, discover, and mix and match image concepts without prior knowledge of how to prompt the model.
2 code implementations • 6 Feb 2023 • Yuancheng Xu, Yanchao Sun, Micah Goldblum, Tom Goldstein, Furong Huang
However, it is unclear whether existing robust training methods effectively increase the margin for each vulnerable point during training.
6 code implementations • 24 Jan 2023 • John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, Tom Goldstein
Potential harms of large language models can be mitigated by watermarking model output, i.e., embedding signals into generated text that are invisible to humans but algorithmically detectable from a short span of tokens.
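As an illustration of what "algorithmically detectable from a short span of tokens" can mean, the sketch below assumes a green-list style scheme: a keyed hash of the previous token marks roughly a fraction `gamma` of candidate tokens as "green", and detection runs a one-proportion z-test on the green count. The hash rule, the key, and the function names are hypothetical and chosen for this example only; the excerpt above does not specify them.

```python
import hashlib
import math

def is_green(prev_token, token, gamma=0.5, key="demo-key"):
    """Hypothetical rule: hash (key, previous token, token) into [0, 1)."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64
    return u < gamma

def watermark_z_score(tokens, gamma=0.5, key="demo-key"):
    """z-score of the green-token count against the no-watermark expectation."""
    n = len(tokens) - 1  # number of (previous, current) pairs scored
    if n < 1:
        raise ValueError("need at least two tokens")
    green = sum(is_green(p, t, gamma, key) for p, t in zip(tokens, tokens[1:]))
    return (green - gamma * n) / math.sqrt(n * gamma * (1 - gamma))

def make_marked_sequence(length, vocab_size=1000, gamma=0.5, key="demo-key"):
    """Toy 'generator' that always emits the first green token it finds."""
    tokens = [0]
    while len(tokens) < length:
        prev = tokens[-1]
        nxt = next(t for t in range(vocab_size) if is_green(prev, t, gamma, key))
        tokens.append(nxt)
    return tokens

marked = make_marked_sequence(50)
z = watermark_z_score(marked)
print(f"z-score on a 50-token marked span: {z:.2f}")
```

With every scored token green, the z-score grows like the square root of the span length, which is why even a short span can give a confident detection under these assumptions.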
no code implementations • 6 Jan 2023 • Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Caiming Xiong, Tom Goldstein, Juan Carlos Niebles, Ran Xu
By plugging our proposed modules into the state-of-the-art transformer-based 3D detectors, we improve the previous best results on both benchmarks, with more significant improvements on smaller objects.
1 code implementation • 28 Dec 2022 • Jonas Geiping, Tom Goldstein
Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners.
1 code implementation • 13 Dec 2022 • Amin Ghiasi, Hamid Kazemi, Eitan Borgnia, Steven Reich, Manli Shu, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein
In addition, we show that ViTs maintain spatial information in all layers except the final layer.
no code implementations • CVPR 2023 • Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein
Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes.
2 code implementations • 29 Nov 2022 • Samuel Dooley, George Z. Wei, Tom Goldstein, John P. Dickerson
Many existing algorithmic audits examine the performance of these systems on later-stage elements of facial analysis systems like facial recognition and age, emotion, or perceived gender prediction; however, a core component of these systems has been vastly understudied from a fairness perspective: face detection, sometimes called face localization.
no code implementations • 23 Oct 2022 • Renkun Ni, Ping-Yeh Chiang, Jonas Geiping, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein
Sharpness-Aware Minimization (SAM) has recently emerged as a robust technique for improving the accuracy of deep neural networks.
1 code implementation • 19 Oct 2022 • Yuxin Wen, Arpit Bansal, Hamid Kazemi, Eitan Borgnia, Micah Goldblum, Jonas Geiping, Tom Goldstein
As industrial applications are increasingly automated by machine learning models, enforcing personal data ownership and intellectual property rights requires tracing training data back to their rightful owners.
1 code implementation • 17 Oct 2022 • Yuxin Wen, Jonas Geiping, Liam Fowl, Hossein Souri, Rama Chellappa, Micah Goldblum, Tom Goldstein
Federated learning is particularly susceptible to model poisoning and backdoor attacks because individual users have direct control over the training data and model updates.
1 code implementation • 12 Oct 2022 • Jonas Geiping, Micah Goldblum, Gowthami Somepalli, Ravid Shwartz-Ziv, Tom Goldstein, Andrew Gordon Wilson
Despite the clear performance benefits of data augmentations, little is known about why they are so effective.
1 code implementation • 15 Sep 2022 • Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao
In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.
2 code implementations • NeurIPS 2023 • Arpit Bansal, Eitan Borgnia, Hong-Min Chu, Jie S. Li, Hamid Kazemi, Furong Huang, Micah Goldblum, Jonas Geiping, Tom Goldstein
We observe that the generative behavior of diffusion models is not strongly dependent on the choice of image degradation, and in fact an entire family of generative models can be constructed by varying this choice.
1 code implementation • 16 Jul 2022 • Arpit Bansal, Ping-Yeh Chiang, Michael Curry, Rajiv Jain, Curtis Wigington, Varun Manjunatha, John P Dickerson, Tom Goldstein
Watermarking is a commonly used strategy to protect creators' rights to digital images, videos and audio.
1 code implementation • 30 Jun 2022 • Roman Levin, Valeriia Cherepanova, Avi Schwarzschild, Arpit Bansal, C. Bayan Bruss, Tom Goldstein, Andrew Gordon Wilson, Micah Goldblum
In this work, we demonstrate that upstream data gives tabular neural networks a decisive advantage over widely used GBDT models.
no code implementations • 16 Jun 2022 • Jiuhai Chen, Jonas Mueller, Vassilis N. Ioannidis, Tom Goldstein, David Wipf
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
2 code implementations • 8 Jun 2022 • Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein, David W. Jacobs
Unfortunately, existing methods require knowledge of both the target architecture and the complete dataset so that a surrogate network can be trained, the parameters of which are used to generate the attack.
no code implementations • 19 Apr 2022 • Pedro Sandoval-Segura, Vasu Singla, Liam Fowl, Jonas Geiping, Micah Goldblum, David Jacobs, Tom Goldstein
We advocate for evaluating poisons in terms of peak test accuracy.
no code implementations • 15 Mar 2022 • Valeriia Cherepanova, Steven Reich, Samuel Dooley, Hossein Souri, Micah Goldblum, Tom Goldstein
This is an unfortunate omission, as 'imbalance' is a more complex matter in identification; imbalance may arise in not only the training data, but also the testing data, and furthermore may affect the proportion of identities belonging to each demographic group or the number of images belonging to each identity.
1 code implementation • CVPR 2022 • Gowthami Somepalli, Liam Fowl, Arpit Bansal, Ping-Yeh Chiang, Yehuda Dar, Richard Baraniuk, Micah Goldblum, Tom Goldstein
We also use decision boundary methods to visualize double descent phenomena.
1 code implementation • 11 Feb 2022 • Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein
Algorithmic extrapolation can be achieved through recurrent systems, which can be iterated many times to solve difficult reasoning problems.
1 code implementation • 1 Feb 2022 • Yuxin Wen, Jonas Geiping, Liam Fowl, Micah Goldblum, Tom Goldstein
Federated learning (FL) has rapidly risen in popularity due to its promise of privacy and efficiency.
1 code implementation • 31 Jan 2022 • Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein
Existing techniques for model inversion typically rely on hard-to-tune regularizers, such as total variation or feature regularization, which must be individually calibrated for each network in order to produce adequate images.
1 code implementation • 29 Jan 2022 • Liam Fowl, Jonas Geiping, Steven Reich, Yuxin Wen, Wojtek Czaja, Micah Goldblum, Tom Goldstein
A central tenet of federated learning (FL), which trains models without centralizing user data, is privacy.
1 code implementation • 28 Jan 2022 • Aounon Kumar, Alexander Levine, Tom Goldstein, Soheil Feizi
Certified robustness in machine learning has primarily focused on adversarial perturbations of the input with a fixed attack budget for each point in the data distribution.
no code implementations • 25 Jan 2022 • Samuel Dooley, George Z. Wei, Tom Goldstein, John P. Dickerson
When we compare the size of these disparities to that of commercial models, we conclude that commercial models, despite their relatively larger development budgets and industry-level fairness commitments, are always as biased as or more biased than an academic model.
no code implementations • 3 Jan 2022 • Harrison Foley, Liam Fowl, Tom Goldstein, Gavin Taylor
Data poisoning for reinforcement learning has historically focused on general performance degradation, and targeted attacks have been successful via perturbations that involve control of the victim's policy and rewards.
no code implementations • NeurIPS 2021 • Yu Shen, Laura Zheng, Manli Shu, Weizi Li, Tom Goldstein, Ming Lin
We introduce a simple yet effective framework for improving the robustness of learning algorithms against image corruptions for autonomous driving.
1 code implementation • 25 Nov 2021 • Zeyad Ali Sami Emam, Hong-Min Chu, Ping-Yeh Chiang, Wojciech Czaja, Richard Leapman, Micah Goldblum, Tom Goldstein
Active learning (AL) algorithms aim to identify an optimal subset of data for annotation, such that deep neural networks (DNN) can achieve better performance when trained on this labeled subset.
1 code implementation • NeurIPS 2021 • Mucong Ding, Kezhi Kong, Jingling Li, Chen Zhu, John P Dickerson, Furong Huang, Tom Goldstein
Our framework avoids the "neighbor explosion" problem of GNNs using quantized representations combined with a low-rank version of the graph convolution matrix.
Ranked #12 on Node Classification on Reddit
1 code implementation • 26 Oct 2021 • Jiuhai Chen, Jonas Mueller, Vassilis N. Ioannidis, Soji Adeshina, Yangkun Wang, Tom Goldstein, David Wipf
For supervised learning with tabular data, decision tree ensembles produced via boosting techniques generally dominate real-world applications involving iid training/test sets.
no code implementations • 26 Oct 2021 • Shishira R Maiya, Max Ehrlich, Vatsal Agarwal, Ser-Nam Lim, Tom Goldstein, Abhinav Shrivastava
Our analysis shows that adversarial examples are neither in high-frequency nor in low-frequency components, but are simply dataset dependent.
3 code implementations • ICLR 2022 • Liam Fowl, Jonas Geiping, Wojtek Czaja, Micah Goldblum, Tom Goldstein
Federated learning has quickly gained popularity with its promises of increased user privacy and efficiency.
no code implementations • 15 Oct 2021 • Samuel Dooley, Ryan Downing, George Wei, Nathan Shankar, Bradon Thymes, Gudrun Thorkelsdottir, Tiye Kurtz-Miott, Rachel Mattson, Olufemi Obiwumi, Valeriia Cherepanova, Micah Goldblum, John P Dickerson, Tom Goldstein
Much recent research has uncovered and discussed serious concerns of bias in facial analysis technologies, finding performance disparities between groups of people based on perceived gender, skin type, lighting condition, etc.
no code implementations • ICLR 2022 • Renkun Ni, Manli Shu, Hossein Souri, Micah Goldblum, Tom Goldstein
Contrastive learning has recently taken off as a paradigm for learning from unlabeled data.
1 code implementation • ICLR 2022 • Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein
It is widely believed that the implicit regularization of SGD is fundamental to the impressive generalization behavior we observe in neural networks.
no code implementations • 29 Sep 2021 • Keshav Ganapathy, Emily Liu, Zain Zarger, Gowthami Somepalli, Micah Goldblum, Tom Goldstein
As machine learning conferences grow rapidly, many are concerned that individuals will be left behind on the basis of traits such as gender and geography.
no code implementations • 29 Sep 2021 • Mucong Ding, Kezhi Kong, Jiuhai Chen, John Kirchenbauer, Micah Goldblum, David Wipf, Furong Huang, Tom Goldstein
We observe that in most cases, we need both a suitable domain generalization algorithm and a strong GNN backbone model to optimize out-of-distribution test performance.
no code implementations • 29 Sep 2021 • Liam H Fowl, Ping-Yeh Chiang, Micah Goldblum, Jonas Geiping, Arpit Amit Bansal, Wojciech Czaja, Tom Goldstein
These two behaviors can be in conflict as an organization wants to prevent competitors from using their own data to replicate the performance of their proprietary models.
no code implementations • 29 Sep 2021 • Liam H Fowl, Micah Goldblum, Arjun Gupta, Amr Sharaf, Tom Goldstein
We validate and deploy this metric on both images and text.
no code implementations • 29 Sep 2021 • Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam H Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, Tom Goldstein
Data poisoning and backdoor attacks manipulate training data to induce security breaches in a victim model.
no code implementations • ICLR 2022 • Jiuhai Chen, Jonas Mueller, Vassilis N. Ioannidis, Soji Adeshina, Yangkun Wang, Tom Goldstein, David Wipf
Many practical modeling tasks require making predictions using tabular data composed of heterogeneous feature types (e.g., text-based, categorical, continuous, etc.).
no code implementations • 29 Sep 2021 • Arpit Bansal, Avi Schwarzschild, Eitan Borgnia, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein
Classical machine learning systems perform best when they are trained and tested on the same distribution, and they lack a mechanism to increase model power after training is complete.
no code implementations • ICLR 2022 • Chen Zhu, Zheng Xu, Mingqing Chen, Jakub Konečný, Andrew Hard, Tom Goldstein
Federated learning has been deployed to train machine learning models from decentralized client data on mobile devices in practice.
2 code implementations • 9 Sep 2021 • Zhipeng Wei, Jingjing Chen, Micah Goldblum, Zuxuan Wu, Tom Goldstein, Yu-Gang Jiang
We evaluate the transferability of attacks on state-of-the-art ViTs, CNNs and robustly trained CNNs.
1 code implementation • 27 Aug 2021 • Samuel Dooley, Tom Goldstein, John P. Dickerson
Facial detection and analysis systems have been deployed by large companies and critiqued by scholars and activists for the past decade.
1 code implementation • 13 Aug 2021 • Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Arpit Bansal, Zeyad Emam, Furong Huang, Micah Goldblum, Tom Goldstein
We describe new datasets for studying generalization from easy to hard examples.
1 code implementation • 3 Aug 2021 • Roman Levin, Manli Shu, Eitan Borgnia, Furong Huang, Micah Goldblum, Tom Goldstein
We find that samples which cause similar parameters to malfunction are semantically similar.
3 code implementations • NeurIPS 2021 • Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro
For instance, Transformer-LS achieves 0.97 test BPC on enwik8 using half the number of parameters of the previous method, while being faster and able to handle sequences 3x as long as its full-attention version on the same hardware.
Ranked #1 on Language Modelling on enwik8 dev
2 code implementations • NeurIPS 2021 • Liam Fowl, Micah Goldblum, Ping-Yeh Chiang, Jonas Geiping, Wojtek Czaja, Tom Goldstein
The adversarial machine learning literature is largely partitioned into evasion attacks on testing data and poisoning attacks on training data.
no code implementations • 17 Jun 2021 • Arpit Bansal, Micah Goldblum, Valeriia Cherepanova, Avi Schwarzschild, C. Bayan Bruss, Tom Goldstein
Class-imbalanced data, in which some classes contain far more samples than others, is ubiquitous in real-world applications.
1 code implementation • 16 Jun 2021 • Hossein Souri, Liam Fowl, Rama Chellappa, Micah Goldblum, Tom Goldstein
In contrast, the Hidden Trigger Backdoor Attack achieves poisoning without placing a trigger into the training data at all.
no code implementations • 15 Jun 2021 • Michael J. Curry, Uro Lyi, Tom Goldstein, John Dickerson
We propose a new architecture to approximately learn incentive compatible, revenue-maximizing auctions from sampled valuations.
1 code implementation • NeurIPS 2021 • Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Furong Huang, Uzi Vishkin, Micah Goldblum, Tom Goldstein
In this work, we show that recurrent networks trained to solve simple problems with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference.
7 code implementations • 2 Jun 2021 • Gowthami Somepalli, Micah Goldblum, Avi Schwarzschild, C. Bayan Bruss, Tom Goldstein
We devise a hybrid deep learning approach to solving tabular data problems.
1 code implementation • ICLR 2021 • Phillip Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom Goldstein
We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images.
no code implementations • 25 Mar 2021 • Zuxuan Wu, Tom Goldstein, Larry S. Davis, Ser-Nam Lim
Many variants of adversarial training have been proposed, with most research focusing on problems with relatively few classes.
no code implementations • 15 Mar 2021 • Shivam Akhauri, Laura Zheng, Tom Goldstein, Ming Lin
Practical learning-based autonomous driving models must be capable of generalizing learned behaviors from simulated to real domains, and from training data to unseen domains with unusual image properties.
no code implementations • 7 Mar 2021 • Chen Chen, Kezhi Kong, Peihong Yu, Juan Luque, Tom Goldstein, Furong Huang
Randomized smoothing (RS) is an effective and scalable technique for constructing neural network classifiers that are certifiably robust to adversarial perturbations.
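The prediction step of randomized smoothing can be sketched as a majority vote of a base classifier over Gaussian-perturbed copies of the input. The snippet below is a minimal illustration under stated assumptions: `toy_classifier` is a hypothetical stand-in for a trained network, and the statistical test that turns vote counts into a certified radius is omitted.

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000,
                     num_classes=2, seed=0):
    """Majority vote of base_classifier over Gaussian-noised copies of x."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    preds = np.array([base_classifier(x + eps) for eps in noise])
    counts = np.bincount(preds, minlength=num_classes)
    return int(np.argmax(counts)), counts

def toy_classifier(x):
    """Hypothetical base classifier: class 1 if the first coordinate is positive."""
    return int(x[0] > 0)

x = np.array([2.0, 0.0])
label, counts = smoothed_predict(toy_classifier, x)
print("smoothed label:", label, "vote counts:", counts)
```

A larger `sigma` tolerates larger perturbations but pushes more noisy copies across the decision boundary, so the vote margin shrinks for inputs near it.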
1 code implementation • 2 Mar 2021 • Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, Tom Goldstein
The InstaHide method has recently been proposed as an alternative to DP training that leverages supposed privacy properties of the mixup augmentation, although without rigorous guarantees.
no code implementations • 26 Feb 2021 • Yu Shen, Laura Zheng, Manli Shu, Weizi Li, Tom Goldstein, Ming C. Lin
For safety of autonomous driving, vehicles need to be able to drive under various lighting, weather, and visibility conditions in different environments.
1 code implementation • 26 Feb 2021 • Jonas Geiping, Liam Fowl, Gowthami Somepalli, Micah Goldblum, Michael Moeller, Tom Goldstein
Data poisoning is a threat model in which a malicious actor tampers with training data to manipulate outcomes at inference time.
1 code implementation • ICLR 2022 • Avi Schwarzschild, Arjun Gupta, Amin Ghiasi, Micah Goldblum, Tom Goldstein
It is widely believed that deep neural networks contain layer specialization, wherein neural networks extract hierarchical features representing edges and patterns in shallow layers and complete objects in deeper layers.
1 code implementation • NeurIPS 2021 • Aounon Kumar, Tom Goldstein
We extend the scope of certifiable robustness to problems with more general and structured outputs like sets, images, language, etc.
no code implementations • 16 Feb 2021 • Liam Fowl, Ping-Yeh Chiang, Micah Goldblum, Jonas Geiping, Arpit Bansal, Wojtek Czaja, Tom Goldstein
Large organizations such as social media companies continually release data, for example user images.
2 code implementations • NeurIPS 2021 • Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein
Innovations in neural architectures have fostered significant breakthroughs in language modeling and computer vision.
Ranked #143 on Image Classification on CIFAR-10
no code implementations • 12 Feb 2021 • Valeriia Cherepanova, Vedant Nanda, Micah Goldblum, John P. Dickerson, Tom Goldstein
As machine learning algorithms have been widely deployed across applications, many concerns have been raised over the fairness of their predictions, especially in high stakes settings (such as facial recognition and medical imaging).
no code implementations • ICLR 2021 • Valeriia Cherepanova, Micah Goldblum, Harrison Foley, Shiyuan Duan, John Dickerson, Gavin Taylor, Tom Goldstein
Facial recognition systems are increasingly deployed by private corporations, government agencies, and contractors for consumer services and mass surveillance programs alike.
no code implementations • 1 Jan 2021 • Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P Dickerson, Tom Goldstein
Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference.
no code implementations • ICLR 2021 • Renkun Ni, Hong-Min Chu, Oscar Castaneda, Ping-Yeh Chiang, Christoph Studer, Tom Goldstein
Low-precision neural networks represent both weights and activations with few bits, drastically reducing the multiplication complexity.
no code implementations • 1 Jan 2021 • Arpit Amit Bansal, Ping-Yeh Chiang, Michael Curry, Hossein Souri, Rama Chellappa, John P Dickerson, Rajiv Jain, Tom Goldstein
Watermarking is a commonly used strategy to protect creators' rights to digital images, videos and audio.
no code implementations • 1 Jan 2021 • Yu Shen, Laura Yu Zheng, Manli Shu, Weizi Li, Tom Goldstein, Ming Lin
To ensure the wide adoption and safety of autonomous driving, the vehicles need to be able to drive under various lighting, weather, and visibility conditions in different environments.
no code implementations • 18 Dec 2020 • Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein
As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance.
no code implementations • NeurIPS 2020 • Ping-Yeh Chiang, Michael Curry, Ahmed Abdelkader, Aounon Kumar, John Dickerson, Tom Goldstein
Despite the vulnerability of object detectors to adversarial attacks, very few defenses are known to date.
no code implementations • 24 Nov 2020 • David Tran, Alex Valtchanov, Keshav Ganapathy, Raymond Feng, Eric Slud, Micah Goldblum, Tom Goldstein
Members of the machine learning community are likely to overhear allegations ranging from randomness of acceptance decisions to institutional bias.
1 code implementation • 18 Nov 2020 • Eitan Borgnia, Valeriia Cherepanova, Liam Fowl, Amin Ghiasi, Jonas Geiping, Micah Goldblum, Tom Goldstein, Arjun Gupta
Data poisoning and backdoor attacks manipulate victim models by maliciously modifying training data.
no code implementations • 24 Oct 2020 • Huimin Zeng, Chen Zhu, Tom Goldstein, Furong Huang
Adversarial training has proven to be an efficient method to defend against adversarial examples, being one of the few defenses that withstand strong attacks.
3 code implementations • CVPR 2022 • Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein
Data augmentation helps neural networks generalize better by enlarging the training set, but it remains an open question how to effectively augment graph data to enhance the performance of GNNs (Graph Neural Networks).
Ranked #1 on Graph Property Prediction on ogbg-ppa
no code implementations • 14 Oct 2020 • Chen Zhu, Zheng Xu, Ali Shafahi, Manli Shu, Amin Ghiasi, Tom Goldstein
Further, we demonstrate that the compact structure and corresponding initialization from the Lottery Ticket Hypothesis can also help in data-free training.
1 code implementation • 14 Oct 2020 • Renkun Ni, Micah Goldblum, Amr Sharaf, Kezhi Kong, Tom Goldstein
Conventional image classifiers are trained by randomly sampling mini-batches of images.
no code implementations • 13 Oct 2020 • Kevin Kuo, Anthony Ostuni, Elizabeth Horishny, Michael J. Curry, Samuel Dooley, Ping-Yeh Chiang, Tom Goldstein, John P. Dickerson
Inspired by these advances, in this paper, we extend techniques for approximating auctions using deep learning to address concerns of fairness while maintaining high revenue and strong incentive guarantees.
no code implementations • 13 Oct 2020 • Liam Fowl, Micah Goldblum, Arjun Gupta, Amr Sharaf, Tom Goldstein
We validate and deploy this metric on both images and text.
1 code implementation • 11 Oct 2020 • David Tran, Alex Valtchanov, Keshav Ganapathy, Raymond Feng, Eric Slud, Micah Goldblum, Tom Goldstein
Members of the machine learning community are likely to overhear allegations ranging from randomness of acceptance decisions to institutional bias.
no code implementations • 28 Sep 2020 • Manli Shu, Zuxuan Wu, Micah Goldblum, Tom Goldstein
Adversarial training is the industry standard for producing models that are robust to small adversarial perturbations.
1 code implementation • NeurIPS 2021 • Manli Shu, Zuxuan Wu, Micah Goldblum, Tom Goldstein
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce models that are robust to various unseen distributional shifts.
no code implementations • NeurIPS 2020 • Aounon Kumar, Alexander Levine, Soheil Feizi, Tom Goldstein
It uses the probabilities of predicting the top two most-likely classes around an input point under a smoothing distribution to generate a certified radius for a classifier's prediction.
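As a rough illustration of how top-two class probabilities can yield a certified radius, the sketch below uses the widely cited bound for isotropic Gaussian smoothing, R = (sigma / 2) * (Phi^{-1}(p_A) - Phi^{-1}(p_B)). Choosing this particular bound is an assumption: the excerpt does not say which smoothing distribution or formula is meant, and the function name and arguments are hypothetical.

```python
from statistics import NormalDist


def certified_radius(p_top: float, p_runner_up: float, sigma: float) -> float:
    """L2 radius from top-two class probabilities under Gaussian smoothing.

    Assumed formula (standard Gaussian-smoothing bound, not taken from the
    listing): R = sigma / 2 * (Phi^-1(p_top) - Phi^-1(p_runner_up)).
    """
    if not (0.0 < p_runner_up <= p_top < 1.0):
        raise ValueError("need 0 < p_runner_up <= p_top < 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    inv_cdf = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return 0.5 * sigma * (inv_cdf(p_top) - inv_cdf(p_runner_up))


if __name__ == "__main__":
    # Hypothetical probabilities; in practice these are estimated by sampling.
    print(certified_radius(p_top=0.9, p_runner_up=0.1, sigma=0.5))
```

In practice the two probabilities are replaced by a lower confidence bound on the top class and an upper bound on the runner-up, so the radius holds with high probability rather than exactly.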
no code implementations • 8 Sep 2020 • Oscar Castañeda, Sven Jacobsson, Giuseppe Durisi, Tom Goldstein, Christoph Studer
All-digital basestation (BS) architectures enable superior spectral efficiency compared to hybrid solutions in massive multi-user MIMO systems.
2 code implementations • ICLR 2021 • Jonas Geiping, Liam Fowl, W. Ronny Huang, Wojciech Czaja, Gavin Taylor, Michael Moeller, Tom Goldstein
We consider a particularly malicious poisoning attack that is both "from scratch" and "clean label", meaning we analyze an attack that successfully works against new, randomly initialized models, and is nearly imperceptible to humans, all while perturbing only a small fraction of the training data.
no code implementations • 26 Jul 2020 • Renkun Ni, Hong-Min Chu, Oscar Castañeda, Ping-Yeh Chiang, Christoph Studer, Tom Goldstein
Low-resolution neural networks represent both weights and activations with few bits, drastically reducing the multiplication complexity.
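To make "few bits" concrete, here is a minimal sketch of symmetric uniform quantization, one common way to map real-valued weights or activations to low-bit integer codes. It is an illustrative assumption, not the scheme used in the listed paper; `quantize_uniform` and `dequantize` are hypothetical names.

```python
def quantize_uniform(values, num_bits):
    """Map floats to signed integer codes in [-(2^(b-1) - 1), 2^(b-1) - 1].

    Returns (codes, scale) so that code * scale approximates the input.
    """
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2 for signed codes")
    levels = 2 ** (num_bits - 1) - 1          # largest representable code
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 1.0         # all-zero input: any scale works
    scale = max_abs / levels                  # real value of one integer step
    codes = [max(-levels, min(levels, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate real values from integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    codes, scale = quantize_uniform([1.0, -0.4, 0.2], num_bits=3)
    print(codes, dequantize(codes, scale))
```

With both operands stored as small integers, each product can be computed with narrow integer arithmetic and rescaled once afterwards, which is where the reduction in multiplication cost comes from.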
1 code implementation • 7 Jul 2020 • Ping-Yeh Chiang, Michael J. Curry, Ahmed Abdelkader, Aounon Kumar, John Dickerson, Tom Goldstein
While adversarial training can improve the empirical robustness of image classifiers, a direct extension to object detection is very expensive.
2 code implementations • 22 Jun 2020 • Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John P. Dickerson, Tom Goldstein
Data poisoning and backdoor attacks manipulate training data in order to cause models to fail during inference.
1 code implementation • 21 Jun 2020 • Chen Zhu, Yu Cheng, Zhe Gan, Furong Huang, Jingjing Liu, Tom Goldstein
Adaptive gradient methods such as RMSProp and Adam use an exponential moving estimate of the squared gradient to compute adaptive step sizes, achieving better convergence than SGD in the face of noisy objectives.
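The adaptive step size mentioned above can be sketched for a single scalar parameter in the style of the textbook RMSProp update. This is a simplified illustration with assumed hyperparameter values, not the method proposed in the listed paper; `rmsprop_step` is a hypothetical helper.

```python
import math


def rmsprop_step(param, grad, sq_avg, lr=0.01, beta=0.9, eps=1e-8):
    """One RMSProp-style update for a scalar parameter.

    sq_avg is the exponential moving estimate of the squared gradient;
    dividing by its square root gives the adaptive step size.
    """
    sq_avg = beta * sq_avg + (1.0 - beta) * grad * grad
    param = param - lr * grad / (math.sqrt(sq_avg) + eps)
    return param, sq_avg


if __name__ == "__main__":
    # Minimize f(x) = x^2, whose gradient is 2x, from x = 1.0.
    x, v = 1.0, 0.0
    for _ in range(200):
        x, v = rmsprop_step(x, 2.0 * x, v)
    print(x)
```

Adam follows the same pattern but also keeps a moving average of the gradient itself and applies bias correction to both averages.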
no code implementations • NeurIPS 2020 • Michael J. Curry, Ping-Yeh Chiang, Tom Goldstein, John Dickerson
We focus on the RegretNet architecture, which can represent auctions with arbitrary numbers of items and participants; it is trained to be empirically strategyproof, but the property is never exactly verified, leaving potential loopholes for market participants to exploit.
no code implementations • 30 May 2020 • Zheng Xu, Ali Shafahi, Tom Goldstein
Our adaptive networks also outperform larger widened non-adaptive architectures that have 1.5 times more parameters.
no code implementations • 20 Apr 2020 • Ahmed Abdelkader, Michael J. Curry, Liam Fowl, Tom Goldstein, Avi Schwarzschild, Manli Shu, Christoph Studer, Chen Zhu
We first demonstrate successful transfer attacks against a victim network using only its feature extractor.
2 code implementations • NeurIPS 2020 • W. Ronny Huang, Jonas Geiping, Liam Fowl, Gavin Taylor, Tom Goldstein
Existing attacks for data poisoning neural networks have relied on hand-crafted heuristics, because solving the poisoning problem directly via bilevel optimization is generally thought of as intractable for deep models.
1 code implementation • ICLR 2020 • Amin Ghiasi, Ali Shafahi, Tom Goldstein
To deflect adversarial attacks, a range of "certified" classifiers have been proposed.
1 code implementation • ICLR 2020 • Ping-Yeh Chiang, Renkun Ni, Ahmed Abdelkader, Chen Zhu, Christoph Studer, Tom Goldstein
Adversarial patch attacks are among the most practical threat models against real-world computer vision systems.
no code implementations • 22 Feb 2020 • Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein
Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.
no code implementations • 21 Feb 2020 • Micah Goldblum, Avi Schwarzschild, Ankit B. Patel, Tom Goldstein
Algorithmic trading systems are often completely automated, and deep learning is increasingly receiving attention in this domain.
1 code implementation • ICML 2020 • Micah Goldblum, Steven Reich, Liam Fowl, Renkun Ni, Valeriia Cherepanova, Tom Goldstein
In doing so, we introduce and verify several hypotheses for why meta-learned models perform better.
1 code implementation • ICML 2020 • Aounon Kumar, Alexander Levine, Tom Goldstein, Soheil Feizi
Notably, for $p \geq 2$, this dependence on $d$ is no better than that of the $\ell_p$-radius that can be certified using isotropic Gaussian smoothing, essentially putting a matching lower bound on the robustness radius.
1 code implementation • 28 Jan 2020 • Ramina Ghods, Andrew S. Lan, Tom Goldstein, Christoph Studer
To address this issue, a variety of methods that rely on random parameter initialization or knowledge distillation have been proposed in the past.
no code implementations • 18 Nov 2019 • Ping-Yeh Chiang, Jonas Geiping, Micah Goldblum, Tom Goldstein, Renkun Ni, Steven Reich, Ali Shafahi
State-of-the-art adversarial attacks on neural networks use expensive iterative methods and numerous random restarts from different initial points.
1 code implementation • ICML 2020 • Chuan Guo, Tom Goldstein, Awni Hannun, Laurens van der Maaten
Good data stewardship requires removal of data at the request of the data's owner.
2 code implementations • ECCV 2020 • Zuxuan Wu, Ser-Nam Lim, Larry Davis, Tom Goldstein
We present a systematic study of adversarial attacks on state-of-the-art object detection frameworks.
no code implementations • 25 Oct 2019 • Ali Shafahi, Amin Ghiasi, Furong Huang, Tom Goldstein
Adversarial training is one of the strongest defenses against adversarial attacks, but it requires adversarial examples to be generated for every mini-batch during optimization.
1 code implementation • 17 Oct 2019 • Yogesh Balaji, Tom Goldstein, Judy Hoffman
Adversarial training is by far the most successful strategy for improving robustness of neural networks to adversarial attacks.
1 code implementation • NeurIPS 2020 • Micah Goldblum, Liam Fowl, Tom Goldstein
Previous work on adversarially robust neural networks for image classification requires large training sets and computationally expensive training procedures.
1 code implementation • ICLR 2020 • Micah Goldblum, Jonas Geiping, Avi Schwarzschild, Michael Moeller, Tom Goldstein
We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike.
no code implementations • 29 Sep 2019 • Eric Lei, Oscar Castañeda, Olav Tirkkonen, Tom Goldstein, Christoph Studer
In this paper, we propose a unified architecture based on Siamese networks that can be used for supervised UE positioning and unsupervised channel charting.
1 code implementation • 29 Sep 2019 • Neehar Peri, Neal Gupta, W. Ronny Huang, Liam Fowl, Chen Zhu, Soheil Feizi, Tom Goldstein, John P. Dickerson
Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.