no code implementations • 27 Apr 2024 • Dang Nguyen, Paymon Haddad, Eric Gan, Baharan Mirzasoleiman
Can we modify the training data distribution to encourage the underlying optimization method toward finding solutions with superior generalization performance on in-distribution data?
1 code implementation • 18 Mar 2024 • Siddharth Joshi, Arnav Jain, Ali Payani, Baharan Mirzasoleiman
We show that subsets that closely preserve the cross-covariance of the images and captions of the full data provably achieve a superior generalization performance.
no code implementations • 18 Mar 2024 • Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi, Baharan Mirzasoleiman
An effective technique for obtaining high-quality representations is adding a projection head on top of the encoder during training, then discarding it and using the pre-projection representations.
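The projection-head trick this entry refers to can be sketched minimally as follows. This is a generic illustration, not the paper's analysis or code; all shapes, weight matrices, and names are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder output (pre-projection representation) for a batch.
h = rng.normal(size=(8, 32))          # 8 examples, 32-dim representations

# Projection head: a small MLP mapping 32 -> 16, used only during training.
W1 = rng.normal(size=(32, 16))
W2 = rng.normal(size=(16, 16))
z = np.maximum(h @ W1, 0.0) @ W2      # post-projection embeddings fed to the loss

# After training, the head is discarded: downstream tasks use h, not z.
downstream_features = h
```

The training loss sees only `z`, but the representations kept for downstream use are the pre-projection `h`.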
no code implementations • 12 Mar 2024 • Yu Yang, Siddhartha Mishra, Jeffrey N Chiang, Baharan Mirzasoleiman
In clinical text summarization on the MIMIC-III dataset (Johnson et al., 2016), S2L again outperforms training on the full dataset using only 50% of the data.
no code implementations • 12 Nov 2023 • Lauren Watson, Eric Gan, Mohan Dantam, Baharan Mirzasoleiman, Rik Sarkar
Differentially private stochastic gradient descent (DP-SGD) is known to have poorer training and test performance on large neural networks, compared to ordinary stochastic gradient descent (SGD).
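For context, the DP-SGD update this result concerns clips each per-example gradient and adds calibrated Gaussian noise before averaging. A minimal numpy sketch with made-up batch and parameter sizes:

```python
import numpy as np

rng = np.random.default_rng(1)

clip_norm = 1.0       # per-example clipping threshold C
noise_mult = 1.1      # noise multiplier sigma

# Hypothetical per-example gradients: batch of 4, parameter dimension 10.
grads = rng.normal(size=(4, 10))

# 1) Clip each example's gradient to norm at most C.
norms = np.linalg.norm(grads, axis=1, keepdims=True)
clipped = grads / np.maximum(1.0, norms / clip_norm)

# 2) Sum, add Gaussian noise scaled to C, and average over the batch.
noisy_sum = clipped.sum(axis=0) + rng.normal(scale=noise_mult * clip_norm, size=10)
dp_grad = noisy_sum / len(grads)
```

The clipping bounds each example's influence, which is exactly what degrades utility for large networks relative to ordinary SGD.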
no code implementations • 10 Oct 2023 • Xuxi Chen, Yu Yang, Zhangyang Wang, Baharan Mirzasoleiman
Dataset distillation aims to minimize the time and memory needed for training deep networks on large datasets, by creating a small set of synthetic images that has a similar generalization performance to that of the full dataset.
no code implementations • 8 Oct 2023 • Yihao Xue, Siddharth Joshi, Dang Nguyen, Baharan Mirzasoleiman
Recently, multimodal contrastive learning (MMCL) approaches, such as CLIP, have achieved remarkable success in learning representations that are robust against distribution shift and generalize to new domains.
no code implementations • 5 Oct 2023 • Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
SAFECLIP trains on the risky data by applying unimodal CL to image and text modalities separately, and trains on the safe data using the CLIP loss.
no code implementations • 28 Jun 2023 • Yuetong Xu, Baharan Mirzasoleiman
One approach for reducing run time and improving efficiency of machine learning is to improve the convergence rate of the optimization algorithm used.
1 code implementation • 21 Jun 2023 • Siddharth Joshi, Yu Yang, Yihao Xue, Wenhan Yang, Baharan Mirzasoleiman
Deep neural networks often exploit non-predictive features that are spuriously correlated with class labels, leading to poor performance on groups of examples without such features.
1 code implementation • 2 Jun 2023 • Yu Yang, Hao Kang, Baharan Mirzasoleiman
To improve the efficiency and sustainability of learning deep models, we propose CREST, the first scalable framework with rigorous theoretical guarantees to identify the most valuable examples for training non-convex models, particularly deep networks.
no code implementations • 30 May 2023 • Yu Yang, Eric Gan, Gintare Karolina Dziugaite, Baharan Mirzasoleiman
In this work, we provide the first theoretical analysis of the effect of simplicity bias on learning spurious correlations.
no code implementations • 25 May 2023 • Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan Mirzasoleiman
However, supervised CL is prone to collapsing representations of subclasses within a class by not capturing all their features, and unsupervised CL may suppress harder class-relevant features by focusing on learning easy class-irrelevant features; both significantly compromise representation quality.
no code implementations • 23 May 2023 • Yihao Xue, Ali Payani, Yu Yang, Baharan Mirzasoleiman
Pretrained machine learning models need to be adapted to distribution shifts when deployed in new target environments.
no code implementations • 8 Apr 2023 • Yu Yang, Besmira Nushi, Hamid Palangi, Baharan Mirzasoleiman
Spurious correlations that degrade model generalization or lead the model to be right for the wrong reasons are one of the main robustness concerns for real-world deployments.
no code implementations • 24 Mar 2023 • Shayan Fazeli, Lionel Levine, Mehrab Beikzadeh, Baharan Mirzasoleiman, Bita Zadeh, Tara Peris, Majid Sarrafzadeh
Recent advances in remote health monitoring systems have significantly benefited patients and played a crucial role in improving their quality of life.
no code implementations • 20 Mar 2023 • Evan Becker, Jingdong Gao, Ted Zadouri, Baharan Mirzasoleiman
This implies that for a particular run of the algorithms, the solution may be much worse than the guarantee, which holds only in expectation.
1 code implementation • 13 Mar 2023 • Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman
In particular, ROCLIP decreases the success rate of targeted data poisoning attacks from 93.75% to 12.5% and that of backdoor attacks to 0%, while improving the model's linear probe performance by 10% and maintaining a zero-shot performance similar to CLIP's.
no code implementations • 11 Mar 2023 • Wenhan Yang, Baharan Mirzasoleiman
Effectively, the high-pass filter captures the dissimilarity between nodes in a neighborhood and the low-pass filter captures the similarity between neighboring nodes. Contrasting the two filtered views allows HLCL to learn rich node representations for graphs, under heterophily and homophily. Empirically, HLCL outperforms state-of-the-art graph CL methods on benchmark heterophily datasets and large-scale real-world datasets by up to 10%.
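The low-pass/high-pass intuition can be illustrated on a toy graph. This is a generic sketch using the standard symmetrically normalized adjacency as the low-pass filter and its complement as the high-pass filter; it is not HLCL's actual filter design or code:

```python
import numpy as np

# Toy 4-node undirected path graph with 2-dim node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])

# Symmetric normalization with self-loops: A_hat = D^{-1/2} (A + I) D^{-1/2}.
A_loop = A + np.eye(4)
d_inv_sqrt = 1.0 / np.sqrt(A_loop.sum(axis=1))
A_hat = d_inv_sqrt[:, None] * A_loop * d_inv_sqrt[None, :]

low_pass = A_hat @ X       # smooths features over neighborhoods (similarity)
high_pass = X - A_hat @ X  # keeps what differs from the neighborhood (dissimilarity)
```

A graph CL method in this spirit would then contrast the two filtered views of each node against each other.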
2 code implementations • 18 Feb 2023 • Siddharth Joshi, Baharan Mirzasoleiman
In this work, we address this problem for the first time, by proving that examples that contribute the most to contrastive SSL are those that have the most similar augmentations to other examples, in expectation.
no code implementations • 31 Jan 2023 • Omead Pooladzandi, Pasha Khosravi, Erik Nijkamp, Baharan Mirzasoleiman
Generative models have the ability to synthesize data points drawn from the data distribution; however, not all generated samples are of high quality.
2 code implementations • 18 Oct 2022 • Yu Yang, Tian Yu Liu, Baharan Mirzasoleiman
Data poisoning causes misclassification of test time target examples by injecting maliciously crafted samples in the training data.
1 code implementation • 15 Oct 2022 • Tian Yu Liu, Baharan Mirzasoleiman
To address this, we propose a rigorous technique to select subsets of data points that when augmented, closely capture the training dynamics of full data augmentation.
no code implementations • 17 Aug 2022 • Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman
However, the effect of label noise on the test loss curve has not been fully explored.
1 code implementation • 14 Aug 2022 • Tian Yu Liu, Yu Yang, Baharan Mirzasoleiman
We make the key observation that attacks introduce local sharp regions of high training loss, which, when minimized, result in learning the adversarial perturbations and make the attack successful.
no code implementations • 28 Jul 2022 • Omead Pooladzandi, David Davini, Baharan Mirzasoleiman
We propose AdaCore, a method that leverages the geometry of the data to extract subsets of the training examples for efficient machine learning.
no code implementations • 29 Jan 2022 • Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman
Self-supervised Contrastive Learning (CL) has been recently shown to be very effective in preventing deep networks from overfitting noisy labels.
1 code implementation • 6 May 2021 • Ahmad Khajehnejad, Moein Khajehnejad, Mahmoudreza Babaei, Krishna P. Gummadi, Adrian Weller, Baharan Mirzasoleiman
The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention.
no code implementations • NeurIPS 2020 • Baharan Mirzasoleiman, Kaidi Cao, Jure Leskovec
Modern neural networks have the capacity to overfit noisy labels frequently found in real-world datasets.
Ranked #36 on Image Classification on mini WebVision 1.0
no code implementations • 25 Sep 2019 • Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec
But because at each epoch the gradients are computed only on the subset S, we obtain a speedup that is inversely proportional to the size of S. Our subset selection algorithm is fully general and can be applied to most IG methods.
1 code implementation • ICLR 2020 • Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia
By removing hidden layers from the target model, using smaller architectures, and training for fewer epochs, we create proxies that are an order of magnitude faster to train.
3 code implementations • ICML 2020 • Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec
Here we develop CRAIG, a method to select a weighted subset (or coreset) of training data that closely estimates the full gradient by maximizing a submodular function.
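The flavor of CRAIG's selection step (greedy maximization of a facility-location objective over per-example gradient similarities, with weights given by cluster sizes) can be sketched as follows. The data, similarity choice, and helper names are invented for illustration; this is not the released implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.normal(size=(20, 5))   # pretend per-example gradients, 20 examples

# Similarity between examples i, j: a constant minus gradient distance,
# so the facility-location objective below is non-negative and monotone.
dist = np.linalg.norm(G[:, None, :] - G[None, :, :], axis=-1)
sim = dist.max() - dist

def greedy_facility_location(sim, k):
    """Greedily pick k medoids maximizing sum_i max_{j in S} sim[i, j]."""
    n = sim.shape[0]
    chosen, best = [], np.zeros(n)
    for _ in range(k):
        # Marginal gain of adding each candidate column j.
        gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
        j = int(np.argmax(gains))
        chosen.append(j)
        best = np.maximum(best, sim[:, j])
    return chosen

subset = greedy_facility_location(sim, k=4)
# Each selected element's weight = number of examples it best represents.
assignment = np.array(subset)[np.argmax(sim[:, subset], axis=1)]
weights = {j: int((assignment == j).sum()) for j in subset}
```

Training then uses the selected examples with their weights, so the weighted subset gradient approximates the full gradient.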
no code implementations • 3 Jun 2019 • Saeed Vahidian, Baharan Mirzasoleiman, Alexander Cloninger
In a number of situations, collecting a function value for every data point may be prohibitively expensive, and random sampling ignores any structure in the underlying data.
no code implementations • 16 May 2019 • Junaid Ali, Mahmoudreza Babaei, Abhijnan Chakraborty, Baharan Mirzasoleiman, Krishna P. Gummadi, Adish Singla
As we show in this paper, the time-criticality of the information could further exacerbate the disparity of influence across groups.
Social and Information Networks • Computers and Society
no code implementations • ICLR 2019 • Cody Coleman, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia
In our approach, we first quickly train a small proxy model, use it to estimate the utility of individual training data points, and then select the most informative ones for training the large target model.
no code implementations • NeurIPS 2018 • Elahe Ghalebi, Baharan Mirzasoleiman, Radu Grosu, Jure Leskovec
We propose a novel framework for providing a non-parametric dynamic network model, based on a mixture of coupled hierarchical Dirichlet processes, from data capturing cascade node infection times.
no code implementations • ICML 2017 • Baharan Mirzasoleiman, Amin Karbasi, Andreas Krause
How can we summarize a dynamic data stream when elements selected for the summary can be deleted at any time?
1 code implementation • 12 Jun 2017 • Baharan Mirzasoleiman, Stefanie Jegelka, Andreas Krause
The need for real-time analysis of rapidly produced data streams (e.g., video and image streams) motivated the design of streaming algorithms that can efficiently extract and summarize useful information from massive data "on the fly".
Data Structures and Algorithms • Information Retrieval
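As a point of comparison, the simplest threshold-based streaming summarizer (sieve-streaming flavor, not this paper's robust algorithm) admits a stream element only if its marginal gain clears a threshold, making a single pass with one decision per element:

```python
def stream_summarize(stream, k, tau):
    """Pick up to k sets from `stream` whose marginal coverage gain >= tau."""
    summary, covered = [], set()
    for item in stream:                  # single pass over the stream
        gain = len(set(item) - covered)  # marginal gain under coverage f
        if gain >= tau and len(summary) < k:
            summary.append(item)
            covered |= set(item)
    return summary, covered

stream = [{1, 2}, {2, 3}, {3}, {4, 5, 6}, {1, 6}]
summary, covered = stream_summarize(stream, k=3, tau=2)
```

Robust variants additionally keep backup elements so the summary survives deletions, which is the harder setting the paper addresses.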
no code implementations • NeurIPS 2016 • Baharan Mirzasoleiman, Morteza Zadimoghaddam, Amin Karbasi
The goal is to provide a succinct summary of a massive dataset, ideally as small as possible, from which customized summaries can be built for each user, i.e., summaries that can contain elements from the public data (for diversity) and users' private data (for personalization).
no code implementations • 17 Jun 2016 • Andrew An Bian, Baharan Mirzasoleiman, Joachim M. Buhmann, Andreas Krause
Submodular continuous functions are a category of (generally) non-convex/non-concave functions with a wide spectrum of applications.
no code implementations • NeurIPS 2015 • Baharan Mirzasoleiman, Amin Karbasi, Ashwinkumar Badanidiyuru, Andreas Krause
In this paper, we formalize this challenge as a submodular cover problem.
no code implementations • 3 Nov 2014 • Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause
Such problems can often be reduced to maximizing a submodular set function subject to various constraints.
no code implementations • 28 Sep 2014 • Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi, Jan Vondrak, Andreas Krause
Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice?
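The lazy greedy algorithm the question refers to exploits the fact that, by submodularity, an element's marginal gain can only shrink as the solution grows, so stale gains serve as valid upper bounds in a priority queue. A toy max-coverage sketch (generic, not from the paper):

```python
import heapq

def lazy_greedy(candidates, k):
    """Lazy greedy for max-coverage: re-evaluate an element's marginal gain
    only when it surfaces at the top of the heap."""
    covered, solution = set(), []
    # Max-heap (via negation) of upper bounds on marginal gains.
    heap = [(-len(c), i) for i, c in enumerate(candidates)]
    heapq.heapify(heap)
    while heap and len(solution) < k:
        neg_bound, i = heapq.heappop(heap)
        fresh = len(set(candidates[i]) - covered)  # recompute current gain
        if -neg_bound == fresh:                    # bound is tight: take it
            solution.append(i)
            covered |= set(candidates[i])
        elif fresh > 0:                            # stale: push back updated
            heapq.heappush(heap, (-fresh, i))
    return solution, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5}, {1}]
sol, cov = lazy_greedy(sets, k=2)
```

When the popped element's recomputed gain matches its stored bound, it must be the true maximizer, since every other element's gain is at most its own (stale) bound.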
no code implementations • NeurIPS 2013 • Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause
Such problems can often be reduced to maximizing a submodular set function subject to cardinality constraints.