Search Results for author: Baharan Mirzasoleiman

Found 45 papers, 12 papers with code

Make the Most of Your Data: Changing the Training Data Distribution to Improve In-distribution Generalization Performance

no code implementations • 27 Apr 2024 • Dang Nguyen, Paymon Haddad, Eric Gan, Baharan Mirzasoleiman

Can we modify the training data distribution to encourage the underlying optimization method toward finding solutions with superior generalization performance on in-distribution data?

Data Augmentation Inductive Bias

Paper
Add Code

Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity

1 code implementation • 18 Mar 2024 • Siddharth Joshi, Arnav Jain, Ali Payani, Baharan Mirzasoleiman

We show that subsets that closely preserve the cross-covariance of the images and captions of the full data provably achieve a superior generalization performance.

Zero-shot Generalization

Paper
Code

Investigating the Benefits of Projection Head for Representation Learning

no code implementations • 18 Mar 2024 • Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi, Baharan Mirzasoleiman

An effective technique for obtaining high-quality representations is adding a projection head on top of the encoder during training, then discarding it and using the pre-projection representations.

Contrastive Learning Data Augmentation +1

Paper
Add Code

SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models

no code implementations • 12 Mar 2024 • Yu Yang, Siddhartha Mishra, Jeffrey N Chiang, Baharan Mirzasoleiman

In clinical text summarization on the MIMIC-III dataset (Johnson et al., 2016), S2L again outperforms training on the full dataset using only 50% of the data.

Math Text Summarization

Paper
Add Code

Inference and Interference: The Role of Clipping, Pruning and Loss Landscapes in Differentially Private Stochastic Gradient Descent

no code implementations • 12 Nov 2023 • Lauren Watson, Eric Gan, Mohan Dantam, Baharan Mirzasoleiman, Rik Sarkar

Differentially private stochastic gradient descent (DP-SGD) is known to have poorer training and test performance on large neural networks, compared to ordinary stochastic gradient descent (SGD).

Dimensionality Reduction

Paper
Add Code

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

no code implementations • 10 Oct 2023 • Xuxi Chen, Yu Yang, Zhangyang Wang, Baharan Mirzasoleiman

Dataset distillation aims to minimize the time and memory needed for training deep networks on large datasets, by creating a small set of synthetic images that has a similar generalization performance to that of the full dataset.

Paper
Add Code

Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift

no code implementations • 8 Oct 2023 • Yihao Xue, Siddharth Joshi, Dang Nguyen, Baharan Mirzasoleiman

Recently, multimodal contrastive learning (MMCL) approaches, such as CLIP, have achieved a remarkable success in learning representations that are robust against distribution shift and generalize to new domains.

Contrastive Learning Zero-Shot Learning

Paper
Add Code

Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks

no code implementations • 5 Oct 2023 • Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman

SAFECLIP trains on the risky data by applying unimodal CL to image and text modalities separately, and trains on the safe data using the CLIP loss.

Contrastive Learning Data Poisoning +1

Paper
Add Code

Ordering for Non-Replacement SGD

no code implementations • 28 Jun 2023 • Yuetong Xu, Baharan Mirzasoleiman

One approach for reducing run time and improving efficiency of machine learning is to reduce the convergence rate of the optimization algorithm used.

Paper
Add Code

Towards Mitigating Spurious Correlations in the Wild: A Benchmark and a more Realistic Dataset

1 code implementation • 21 Jun 2023 • Siddharth Joshi, Yu Yang, Yihao Xue, Wenhan Yang, Baharan Mirzasoleiman

Deep neural networks often exploit non-predictive features that are spuriously correlated with class labels, leading to poor performance on groups of examples without such features.

Paper
Code

Towards Sustainable Learning: Coresets for Data-efficient Deep Learning

1 code implementation • 2 Jun 2023 • Yu Yang, Hao Kang, Baharan Mirzasoleiman

To improve the efficiency and sustainability of learning deep models, we propose CREST, the first scalable framework with rigorous theoretical guarantees to identify the most valuable examples for training non-convex models, particularly deep networks.

Paper
Code

Identifying Spurious Biases Early in Training through the Lens of Simplicity Bias

no code implementations • 30 May 2023 • Yu Yang, Eric Gan, Gintare Karolina Dziugaite, Baharan Mirzasoleiman

In this work, we provide the first theoretical analysis of the effect of simplicity bias on learning spurious correlations.

Inductive Bias

Paper
Add Code

Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression

no code implementations • 25 May 2023 • Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan Mirzasoleiman

However, supervised CL is prone to collapsing representations of subclasses within a class by not capturing all their features, and unsupervised CL may suppress harder class-relevant features by focusing on learning easy class-irrelevant features; both significantly compromise representation quality.

Contrastive Learning Representation Learning

Paper
Add Code

Few-shot Adaption to Distribution Shifts By Mixing Source and Target Embeddings

no code implementations • 23 May 2023 • Yihao Xue, Ali Payani, Yu Yang, Baharan Mirzasoleiman

Pretrained machine learning models need to be adapted to distribution shifts when deployed in new target environments.

Paper
Add Code

Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

no code implementations • 8 Apr 2023 • Yu Yang, Besmira Nushi, Hamid Palangi, Baharan Mirzasoleiman

Spurious correlations that degrade model generalization or lead the model to be right for the wrong reasons are one of the main robustness concerns for real-world deployments.

Attribute

Paper
Add Code

A Self-supervised Framework for Improved Data-Driven Monitoring of Stress via Multi-modal Passive Sensing

no code implementations • 24 Mar 2023 • Shayan Fazeli, Lionel Levine, Mehrab Beikzadeh, Baharan Mirzasoleiman, Bita Zadeh, Tara Peris, Majid Sarrafzadeh

Recent advances in remote health monitoring systems have significantly benefited patients and played a crucial role in improving their quality of life.

Paper
Add Code

High Probability Bounds for Stochastic Continuous Submodular Maximization

no code implementations • 20 Mar 2023 • Evan Becker, Jingdong Gao, Ted Zadouri, Baharan Mirzasoleiman

This implies that for a particular run of the algorithms, the solution may be much worse than the provided guarantee in expectation.

Vocal Bursts Intensity Prediction

Paper
Add Code

Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks

1 code implementation • 13 Mar 2023 • Wenhan Yang, Jingdong Gao, Baharan Mirzasoleiman

In particular, ROCLIP decreases the success rate for targeted data poisoning attacks from 93. 75% to 12. 5% and that of backdoor attacks down to 0%, while improving the model's linear probe performance by 10% and maintains a similar zero shot performance compared to CLIP.

Backdoor Attack Data Poisoning +2

Paper
Code

Contrastive Learning under Heterophily

no code implementations • 11 Mar 2023 • Wenhan Yang, Baharan Mirzasoleiman

Effectively, the high-pass filter captures the dissimilarity between nodes in a neighborhood and the low-pass filter captures the similarity between neighboring nodes. Contrasting the two filtered views allows HLCL to learn rich node representations for graphs, under heterophily and homophily. Empirically, HLCL outperforms state-of-the-art graph CL methods on benchmark heterophily datasets and large-scale real-world datasets by up to 10%.

Contrastive Learning

Paper
Add Code

Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least

2 code implementations • 18 Feb 2023 • Siddharth Joshi, Baharan Mirzasoleiman

In this work, we address this problem for the first time, by proving that examples that contribute the most to contrastive SSL are those that have the most similar augmentations to other examples, in expectation.

Contrastive Learning Open-Ended Question Answering +1

Paper
Code

Generating High Fidelity Synthetic Data via Coreset selection and Entropic Regularization

no code implementations • 31 Jan 2023 • Omead Pooladzandi, Pasha Khosravi, Erik Nijkamp, Baharan Mirzasoleiman

Generative models have the ability to synthesize data points drawn from the data distribution, however, not all generated samples are high quality.

Vocal Bursts Intensity Prediction

Paper
Add Code

Not All Poisons are Created Equal: Robust Training against Data Poisoning

2 code implementations • 18 Oct 2022 • Yu Yang, Tian Yu Liu, Baharan Mirzasoleiman

Data poisoning causes misclassification of test time target examples by injecting maliciously crafted samples in the training data.

Data Poisoning

Paper
Code

Data-Efficient Augmentation for Training Neural Networks

1 code implementation • 15 Oct 2022 • Tian Yu Liu, Baharan Mirzasoleiman

To address this, we propose a rigorous technique to select subsets of data points that when augmented, closely capture the training dynamics of full data augmentation.

Data Augmentation

Paper
Code

Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise

no code implementations • 17 Aug 2022 • Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman

However, the effect of label noise on the test loss curve has not been fully explored.

Attribute

Paper
Add Code

Friendly Noise against Adversarial Noise: A Powerful Defense against Data Poisoning Attacks

1 code implementation • 14 Aug 2022 • Tian Yu Liu, Yu Yang, Baharan Mirzasoleiman

We make the key observation that attacks introduce local sharp regions of high training loss, which when minimized, results in learning the adversarial perturbations and makes the attack successful.

Data Poisoning

Paper
Code

Adaptive Second Order Coresets for Data-efficient Machine Learning

no code implementations • 28 Jul 2022 • Omead Pooladzandi, David Davini, Baharan Mirzasoleiman

We propose AdaCore, a method that leverages the geometry of the data to extract subsets of the training examples for efficient machine learning.

BIG-bench Machine Learning Second-order methods

Paper
Add Code

Investigating Why Contrastive Learning Benefits Robustness Against Label Noise

no code implementations • 29 Jan 2022 • Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman

Self-supervised Contrastive Learning (CL) has been recently shown to be very effective in preventing deep networks from overfitting noisy labels.

Contrastive Learning

Paper
Add Code

CrossWalk: Fairness-enhanced Node Representation Learning

1 code implementation • 6 May 2021 • Ahmad Khajehnejad, Moein Khajehnejad, Mahmoudreza Babaei, Krishna P. Gummadi, Adrian Weller, Baharan Mirzasoleiman

The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention.

Fairness Link Prediction +2

Paper
Code

Coresets for Robust Training of Deep Neural Networks against Noisy Labels

no code implementations • NeurIPS 2020 • Baharan Mirzasoleiman, Kaidi Cao, Jure Leskovec

Modern neural networks have the capacity to overfit noisy labels frequently found in real-world datasets.

Paper
Add Code

Coresets for Robust Training of Neural Networks against Noisy Labels

no code implementations • 15 Nov 2020 • Baharan Mirzasoleiman, Kaidi Cao, Jure Leskovec

Modern neural networks have the capacity to overfit noisy labels frequently found in real-world datasets.

Ranked #36 on Image Classification on mini WebVision 1.0

Image Classification

Paper
Add Code

Coresets for Accelerating Incremental Gradient Methods

no code implementations • 25 Sep 2019 • Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec

But because at each epoch the gradients are computed only on the subset S, we obtain a speedup that is inversely proportional to the size of S. Our subset selection algorithm is fully general and can be applied to most IG methods.

Open-Ended Question Answering

Paper
Add Code

Selection via Proxy: Efficient Data Selection for Deep Learning

1 code implementation • ICLR 2020 • Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia

By removing hidden layers from the target model, using smaller architectures, and training for fewer epochs, we create proxies that are an order of magnitude faster to train.

Active Learning Computational Efficiency

Paper
Code

Coresets for Data-efficient Training of Machine Learning Models

3 code implementations • ICML 2020 • Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec

Here we develop CRAIG, a method to select a weighted subset (or coreset) of training data that closely estimates the full gradient by maximizing a submodular function.

BIG-bench Machine Learning Open-Ended Question Answering

309

Paper
Code

Coresets for Estimating Means and Mean Square Error with Limited Greedy Samples

no code implementations • 3 Jun 2019 • Saeed Vahidian, Baharan Mirzasoleiman, Alexander Cloninger

In a number of situations, collecting a function value for every data point may be prohibitively expensive, and random sampling ignores any structure in the underlying data.

Clustering Node Classification

Paper
Add Code

On the Fairness of Time-Critical Influence Maximization in Social Networks

no code implementations • 16 May 2019 • Junaid Ali, Mahmoudreza Babaei, Abhijnan Chakraborty, Baharan Mirzasoleiman, Krishna P. Gummadi, Adish Singla

As we show in this paper, the time-criticality of the information could further exacerbate the disparity of influence across groups.

Social and Information Networks Computers and Society

Paper
Add Code

Select Via Proxy: Efficient Data Selection For Training Deep Networks

no code implementations • ICLR 2019 • Cody Coleman, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia

In our approach, we first train a small proxy model quickly, which we then use to estimate the utility of individual training data points, and then select the most informative ones for training the large target model.

BIG-bench Machine Learning Image Classification +1

Paper
Add Code

Dynamic Network Model from Partial Observations

no code implementations • NeurIPS 2018 • Elahe Ghalebi, Baharan Mirzasoleiman, Radu Grosu, Jure Leskovec

We propose a novel framework for providing a non-parametric dynamic network model--based on a mixture of coupled hierarchical Dirichlet processes-- based on data capturing cascade node infection times.

Open-Ended Question Answering

Paper
Add Code

Deletion-Robust Submodular Maximization: Data Summarization with "the Right to be Forgotten"

no code implementations • ICML 2017 • Baharan Mirzasoleiman, Amin Karbasi, Andreas Krause

How can we summarize a dynamic data stream when elements selected for the summary can be deleted at any time?

Data Summarization News Recommendation

Paper
Add Code

Streaming Non-monotone Submodular Maximization: Personalized Video Summarization on the Fly

1 code implementation • 12 Jun 2017 • Baharan Mirzasoleiman, Stefanie Jegelka, Andreas Krause

The need for real time analysis of rapidly producing data streams (e. g., video and image streams) motivated the design of streaming algorithms that can efficiently extract and summarize useful information from massive data "on the fly".

Data Structures and Algorithms Information Retrieval

Paper
Code

Fast Distributed Submodular Cover: Public-Private Data Summarization

no code implementations • NeurIPS 2016 • Baharan Mirzasoleiman, Morteza Zadimoghaddam, Amin Karbasi

The goal is to provide a succinct summary of massive dataset, ideally as small as possible, from which customized summaries can be built for each user, i. e. it can contain elements from the public data (for diversity) and users' private data (for personalization).

Data Summarization Movie Recommendation +1

Paper
Add Code

Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains

no code implementations • 17 Jun 2016 • Andrew An Bian, Baharan Mirzasoleiman, Joachim M. Buhmann, Andreas Krause

Submodular continuous functions are a category of (generally) non-convex/non-concave functions with a wide spectrum of applications.

Data Summarization energy management +1

Paper
Add Code

Distributed Submodular Cover: Succinctly Summarizing Massive Data

no code implementations • NeurIPS 2015 • Baharan Mirzasoleiman, Amin Karbasi, Ashwinkumar Badanidiyuru, Andreas Krause

In this paper, we formalize this challenge as a submodular cover problem.

Clustering Data Summarization

Paper
Add Code

Distributed Submodular Maximization

no code implementations • 3 Nov 2014 • Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause

Such problems can often be reduced to maximizing a submodular set function subject to various constraints.

Clustering

Paper
Add Code

Lazier Than Lazy Greedy

no code implementations • 28 Sep 2014 • Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, Amin Karbasi, Jan Vondrak, Andreas Krause

Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice?

Clustering Data Summarization

Paper
Add Code

Distributed Submodular Maximization: Identifying Representative Elements in Massive Data

no code implementations • NeurIPS 2013 • Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, Andreas Krause

Such problems can often be reduced to maximizing a submodular set function subject to cardinality constraints.

Clustering

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.