no code implementations • ICML 2020 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
In this paper, we study the dynamics of neural net outputs in SSL and show that selecting, and training on first, the unlabeled samples whose outputs are most consistent over the course of training (i.e., those with high "time-consistency") can improve the final test accuracy and save computation.
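A minimal sketch of how such a time-consistency score might be computed; the scoring rule below (mean change in softmax outputs across epochs) is an illustrative assumption, not the paper's exact definition:

```python
import numpy as np

def time_consistency_scores(prob_history):
    """prob_history: (epochs, n_samples, n_classes) softmax outputs
    recorded over training. Higher score = more consistent over time."""
    # change in each sample's predicted distribution between epochs
    deltas = np.linalg.norm(np.diff(prob_history, axis=0), axis=-1)
    return -deltas.mean(axis=0)  # low average change => high consistency

# select the most time-consistent unlabeled samples first
history = np.random.dirichlet(np.ones(10), size=(20, 1000))  # toy data
selected = np.argsort(time_consistency_scores(history))[::-1][:256]
```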
no code implementations • 13 Mar 2024 • Gantavya Bhatt, Arnav Das, Jeff Bilmes
In this paper, we introduce deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, and methods for their training using a contrastive-learning-inspired, GPC-ready strategy to connect and then tackle both of the above challenges.
no code implementations • 6 Mar 2024 • Ziyue Li, Tian Li, Virginia Smith, Jeff Bilmes, Tianyi Zhou
Optimizing the performance of many objectives (instantiated by tasks or clients) jointly with a few Pareto stationary solutions (models) is critical in machine learning.
no code implementations • 25 Nov 2023 • Sahil Verma, Gantavya Bhatt, Avi Schwarzschild, Soumye Singhal, Arnav Mohanty Das, Chirag Shah, John P Dickerson, Jeff Bilmes
In this work, we demonstrate that the efficacy of CleanCLIP in mitigating backdoors is highly dependent on the particular objective used during model pre-training.
no code implementations • 10 May 2023 • Arnav Das, Gantavya Bhatt, Megh Bhalerao, Vianne Gao, Rui Yang, Jeff Bilmes
A major problem with active learning (AL) is its high training cost, since models are typically retrained from scratch after every query round.
no code implementations • 7 Jul 2022 • Adhyyan Narang, Omid Sadeghi, Lillian J Ratliff, Maryam Fazel, Jeff Bilmes
In each round, a user $q$ with unknown utility $h_q$ arrives; the optimizer selects a new item to add to that user's set $S_q$ and receives a noisy marginal gain.
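A hedged sketch of this interaction protocol; the coverage-style utility and Gaussian noise below are toy stand-ins for the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(0)
items = range(50)
weights = rng.random(50)

def utility(S):
    """Toy monotone submodular utility: concave-over-modular coverage."""
    return weights[list(S)].sum() ** 0.5 if S else 0.0

def noisy_marginal_gain(S, item, sigma=0.1):
    """What the optimizer observes when it adds `item` to S."""
    return utility(S | {item}) - utility(S) + rng.normal(0.0, sigma)

S = set()
for _ in range(10):  # one arriving user per round
    gains = {a: noisy_marginal_gain(S, a) for a in items if a not in S}
    S.add(max(gains, key=gains.get))  # greedy on the noisy estimates
```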
no code implementations • 31 Jan 2022 • Jeff Bilmes
In this manuscript, we offer a gentle review of submodularity and supermodularity and their properties.
no code implementations • ICLR 2022 • Ravikumar Balakrishnan, Tian Li, Tianyi Zhou, Nageen Himayat, Virginia Smith, Jeff Bilmes
In every communication round of federated learning, a random subset of clients communicates its model updates back to the server, which then aggregates them all.
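As a point of reference for the round described above, a minimal FedAvg-style server aggregation sketch (this illustrates the generic setting only, not this paper's client-selection method):

```python
import numpy as np

def aggregate(client_updates, client_weights):
    """Server-side weighted average of the received model updates."""
    total = sum(client_weights)
    return sum(w * u for u, w in zip(client_updates, client_weights)) / total

# e.g., three sampled clients sending flat parameter-update vectors,
# weighted by their local dataset sizes
updates = [np.random.randn(4) for _ in range(3)]
new_global_update = aggregate(updates, client_weights=[100, 50, 25])
```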
1 code implementation • 15 May 2021 • Sunil Thulasidasan, Sushil Thapa, Sayera Dhaubhadel, Gopinath Chennupati, Tanmoy Bhattacharya, Jeff Bilmes
In this work, we present a simple but highly effective approach to out-of-distribution detection that uses the principle of abstention: when encountering a sample from an unseen class, the desired behavior is to abstain from predicting.
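A minimal sketch of the abstention idea at inference time, assuming the network has been trained with an extra (K+1)-th "abstain" logit; the training objective itself is not shown:

```python
import numpy as np

def predict_with_abstention(logits):
    """logits: (n, K+1), where the last column is an explicit abstain class.
    Abstain whenever that class wins; otherwise return the class argmax."""
    preds = logits.argmax(axis=1)
    abstain_class = logits.shape[1] - 1
    return np.where(preds == abstain_class, -1, preds)  # -1 = abstain

logits = np.random.randn(8, 11)  # 10 real classes + abstain
print(predict_with_abstention(logits))
```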
no code implementations • 30 Apr 2021 • Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer
With the rapid growth of data, it is becoming increasingly difficult to train or improve deep learning models with the right subset of data.
1 code implementation • 27 Feb 2021 • Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer
Examples of such problems include: (i) targeted learning, where the goal is to find subsets with rare classes or rare attributes on which the model is underperforming, and (ii) guided summarization, where data (e.g., an image collection, text, a document, or video) is summarized for quicker human consumption with specific additional user intent.
no code implementations • ICLR 2021 • Tianyi Zhou, Shengjie Wang, Jeff Bilmes
Neural net training can easily overfit to noisy labels and end up with poor generalization performance.
no code implementations • 12 Oct 2020 • Vishal Kaushal, Suraj Kothawade, Ganesh Ramakrishnan, Jeff Bilmes, Himanshu Asnani, Rishabh Iyer
We study submodular information measures as a rich framework for generic, query-focused, privacy-sensitive, and update summarization tasks.
no code implementations • 27 Jun 2020 • Rishabh Iyer, Jeff Bilmes
In this paper, we try to provide a more complete picture of the relationship between submodularity and concavity.
no code implementations • 27 Jun 2020 • Rishabh Iyer, Ninad Khargonkar, Jeff Bilmes, Himanshu Asnani
In this paper, we study combinatorial information measures that generalize independence, (conditional) entropy, (conditional) mutual information, and total correlation defined over sets of (not necessarily random) variables.
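Representative definitions from this framework, for a submodular $f$ over sets $A$ and $B$ (the conditional gain and the submodular mutual information):

```latex
f(A \mid B) = f(A \cup B) - f(B), \qquad
I_f(A; B) = f(A) + f(B) - f(A \cup B).
```

For non-negative $f$, submodularity guarantees $I_f(A;B) \ge f(A \cap B) \ge 0$, mirroring the non-negativity of classical mutual information.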
no code implementations • 25 Sep 2019 • Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec
Because at each epoch the gradients are computed only on the subset $S$, we obtain a speedup that is inversely proportional to the size of $S$. Our subset selection algorithm is fully general and can be applied to most incremental gradient (IG) methods.
3 code implementations • ICML 2020 • Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec
Here we develop CRAIG, a method to select a weighted subset (or coreset) of training data that closely estimates the full gradient by maximizing a submodular function.
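A sketch in the spirit of that selection step, assuming a precomputed matrix of pairwise gradient similarities; the published method additionally uses per-class upper bounds and careful normalization, so treat this as illustrative:

```python
import numpy as np

def greedy_coreset(grad_sims, k):
    """Greedy facility-location maximization over an (n, n) matrix of
    gradient similarities: pick k points so that every training point
    has a similar selected representative, then weight each selected
    point by how many points it represents."""
    n = grad_sims.shape[0]
    selected, best = [], np.zeros(n)
    for _ in range(k):
        gains = np.maximum(grad_sims, best[None, :]).sum(axis=1) - best.sum()
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, grad_sims[j])
    reps = grad_sims[selected].argmax(axis=0)    # nearest selected point
    weights = np.bincount(reps, minlength=len(selected))
    return selected, weights                     # weights rescale gradients
```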
2 code implementations • NeurIPS 2019 • Sunil Thulasidasan, Gopinath Chennupati, Jeff Bilmes, Tanmoy Bhattacharya, Sarah Michalak
In this work, we discuss a hitherto untouched aspect of mixup training: the calibration and predictive uncertainty of models trained with mixup.
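For reference, the standard mixup training rule whose calibration effects are studied here (the Beta parameter `alpha` is the usual mixup hyperparameter):

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=np.random.default_rng()):
    """Standard mixup: train on convex combinations of random example
    pairs and of their one-hot labels."""
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix
```

Because the targets become soft rather than one-hot, the trained network's confidence behaves differently, which is the kind of effect the paper investigates.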
2 code implementations • 27 May 2019 • Sunil Thulasidasan, Tanmoy Bhattacharya, Jeff Bilmes, Gopinath Chennupati, Jamal Mohd-Yusof
In the case of unstructured (arbitrary) label noise, abstention during training enables the deep abstaining classifier (DAC) to be used as an effective data cleaner by identifying samples that are likely to have label noise.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In particular, we study how to attribute a DNN's bias to its input features.
no code implementations • ICLR 2019 • Shengjie Wang, Tianyi Zhou, Jeff Bilmes
In this paper, we discuss three novel observations about dropout to better understand the generalization of DNNs with rectified linear unit (ReLU) activations: 1) dropout is a smoothing technique that encourages each local linear model of a DNN to be trained on data points from nearby regions; 2) a constant dropout rate can result in effective neural-deactivation rates that are significantly different for layers with different fractions of activated neurons; and 3) the rescaling factor of dropout causes an inconsistency to occur between the normalization during training and testing conditions when batch normalization is also used.
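A quick numeric illustration of the third observation: inverted dropout's $1/(1-p)$ rescaling preserves the mean of the activations but not their second moment, so batch-norm statistics gathered during training no longer match the test-time distribution (toy example using standard inverted dropout):

```python
import numpy as np

p = 0.5                                  # dropout rate
h = np.random.randn(100_000)             # pre-dropout activations
mask = np.random.rand(100_000) > p

train_out = h * mask / (1 - p)           # inverted dropout at training time
test_out = h                             # dropout is the identity at test time

# means agree, but second moments differ by a factor of ~1/(1-p),
# which is exactly what batch normalization's running statistics track
print(train_out.mean(), test_out.mean())
print((train_out**2).mean(), (test_out**2).mean())
```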
no code implementations • 26 Feb 2019 • Rishabh Iyer, Jeff Bilmes
We are motivated by large-scale submodular optimization problems, where standard algorithms that treat the submodular functions in the "value oracle model" do not scale.
no code implementations • 26 Feb 2019 • Rishabh Iyer, Jeff Bilmes
In this paper, we investigate a class of submodular problems that are, in general, very hard.
no code implementations • ICML 2018 • Andrew Cotter, Mahdi Milani Fard, Seungil You, Maya Gupta, Jeff Bilmes
We introduce the problem of grouping a finite ground set into blocks where each block is a subset of the ground set and where: (i) the blocks are individually highly valued by a submodular function (both robustly and in the average case) while satisfying block-specific matroid constraints; and (ii) block scores interact where blocks are jointly scored highly, thus making the blocks mutually non-redundant.
no code implementations • ICML 2018 • Wenruo Bai, Jeff Bilmes
We analyze the performance of the greedy algorithm, and also a discrete semi-gradient based algorithm, for maximizing the sum of a suBmodular and suPermodular (BP) function (both of which are non-negative monotone non-decreasing) under two types of constraints, either a cardinality constraint or $p\geq 1$ matroid independence constraints.
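The greedy algorithm analyzed there is the plain one below; the paper's contribution is bounding its approximation quality for BP objectives in terms of the curvature of both components (this sketch shows only the algorithm, under a cardinality constraint):

```python
def greedy_max(F, ground, k):
    """Plain greedy maximization of F(S) subject to |S| <= k.
    `F` maps a set to a real value; `ground` is the ground set."""
    S = set()
    for _ in range(k):
        gains = {v: F(S | {v}) - F(S) for v in ground - S}
        v_star = max(gains, key=gains.get)
        if gains[v_star] <= 0:            # no remaining element helps
            break
        S.add(v_star)
    return S
```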
no code implementations • ICLR 2018 • Tianyi Zhou, Jeff Bilmes
We introduce and study minimax curriculum learning (MCL), a new method for adaptively selecting a sequence of training subsets for a succession of stages in machine learning.
no code implementations • 1 Jun 2016 • Tianyi Zhou, Jeff Bilmes
We propose a streaming submodular maximization algorithm "stream clipper" that performs as well as the offline greedy algorithm on document/video summarization in practice.
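For context, a generic one-pass threshold-based streaming selector in the same spirit; the paper's "stream clipper" uses its own accept/reject rule, so this is only a sketch of the setting:

```python
def threshold_stream(stream, F, k, tau):
    """Keep an arriving item if its marginal gain clears the threshold
    tau and the budget k allows; one pass, O(k) memory."""
    S = set()
    for v in stream:
        if len(S) < k and F(S | {v}) - F(S) >= tau:
            S.add(v)
    return S
```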
no code implementations • 1 Jun 2016 • Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin
We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization.
1 code implementation • 2 Feb 2016 • Weiran Wang, Raman Arora, Karen Livescu, Jeff Bilmes
We consider learning representations (features) in the setting in which we have access to multiple unlabeled views of the data for learning while only one view is available for downstream tasks.
no code implementations • NeurIPS 2015 • Jennifer Gillenwater, Rishabh Iyer, Bethany Lusch, Rahul Kidambi, Jeff Bilmes
We show that there is a largely unexplored class of functions (positive polymatroids) that can define proper discrete metrics over pairs of binary vectors and that are fairly tractable to optimize over.
no code implementations • NeurIPS 2015 • Kai Wei, Rishabh Iyer, Shengjie Wang, Wenruo Bai, Jeff Bilmes
While the robust versions have been studied in the theory community, existing work has focused on tight approximation guarantees, and the resultant algorithms are not, in general, scalable to very large real-world applications.
no code implementations • 24 Jun 2015 • Rishabh Iyer, Jeff Bilmes
This manuscript provides a more complete picture of the relationship of submodularity to convexity and concavity, by extending many of the results connecting submodularity with convexity to the concave aspects of submodularity.
no code implementations • NeurIPS 2014 • Tianyi Zhou, Jeff Bilmes, Carlos Guestrin
We reduce a broad class of machine learning problems, usually addressed by EM or sampling, to the problem of finding the $k$ extremal rays spanning the conical hull of a data point set.
no code implementations • 2 Feb 2014 • Stefanie Jegelka, Jeff Bilmes
We study an extension of the classical graph cut problem, wherein we replace the modular (sum of edge weights) cost function by a submodular set function defined over graph edges.
no code implementations • NeurIPS 2013 • Rishabh Iyer, Stefanie Jegelka, Jeff Bilmes
We either use a black-box transformation of the function (for approximation and learning), or a transformation of algorithms to use an appropriate surrogate function (for minimization).
no code implementations • NeurIPS 2013 • Rishabh Iyer, Jeff Bilmes
We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost).
no code implementations • 24 Aug 2013 • Rishabh Iyer, Jeff Bilmes
We show how a number of recently used web ranking models are forms of Lovász-Bregman rank aggregation, and also observe that a natural form of the Mallows model using the LB divergence has been used as a conditional ranking model for the 'Learning to Rank' problem.
no code implementations • 5 Aug 2013 • Rishabh Iyer, Stefanie Jegelka, Jeff Bilmes
We present a practical and powerful new framework for both unconstrained and constrained submodular function optimization based on discrete semidifferentials (sub- and super-differentials).
no code implementations • 3 Jul 2012 • Rishabh Iyer, Jeff Bilmes
We extend the work of Narasimhan and Bilmes [30] for minimizing set functions representable as a difference between submodular functions.
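Schematically, this family of DS (difference-of-submodular) procedures iterates as follows, where $m_{g,X_t}$ is a modular lower bound of $g$ that is tight at the current set $X_t$; this is the classic submodular-supermodular style step that the paper generalizes, not its exact algorithm:

```latex
X_{t+1} \in \operatorname*{argmin}_{X \subseteq V} \; f(X) - m_{g, X_t}(X),
\quad \text{with } m_{g, X_t} \le g \text{ and } m_{g, X_t}(X_t) = g(X_t).
```

Each step minimizes a submodular function (submodular minus modular), and the true objective $f - g$ is non-increasing across iterations.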
no code implementations • 13 Jun 2012 • Jeff Bilmes, Andrew Ng
This is the Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, which was held in Montreal, QC, Canada, June 18-21, 2009.