no code implementations • 6 Sep 2024 • Maria-Florina Balcan, Anh Tuan Nguyen, Dravyansh Sharma
We then show that for many parameterized algorithms of interest, their utility function possesses a refined piecewise structure, which automatically translates to learning guarantees using our proposed framework.
no code implementations • 4 Sep 2024 • Maria-Florina Balcan, Matteo Pozzi, Dravyansh Sharma
Overcoming the impact of selfish behavior of rational players in multiagent systems is a fundamental problem in game theory.
no code implementations • 13 Feb 2024 • Keegan Harris, Zhiwei Steven Wu, Maria-Florina Balcan
In sharp contrast to the non-contextual version, we show that it is impossible for the leader to achieve good performance (measured by regret) in the full adversarial setting.
no code implementations • 1 Feb 2024 • Runtian Zhai, Rattana Pukdee, Roger Jin, Maria-Florina Balcan, Pradeep Ravikumar
Unlabeled data is a key component of modern machine learning.
no code implementations • 3 Oct 2023 • Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar
For this method, we prove that a bandit online learning algorithm, using only the number of iterations as feedback, can select parameters for a sequence of instances such that the overall cost approaches that of the best fixed $\omega$ as the sequence length increases.
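As an illustrative sketch of this setup (not the paper's algorithm), the following uses a simple epsilon-greedy bandit to tune the SOR relaxation parameter $\omega$ across a stream of random symmetric positive definite linear systems, with the iteration count as the only feedback. The candidate grid, exploration rate, and problem generator are all arbitrary choices for the sketch.

```python
import numpy as np

def sor_iterations(A, b, omega, tol=1e-8, max_iter=500):
    """Run successive over-relaxation on Ax = b; return iterations to tolerance."""
    n = len(b)
    x = np.zeros(n)
    for k in range(1, max_iter + 1):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x[i] = (1 - omega) * x_old[i] + omega * (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old) < tol:
            return k
    return max_iter

rng = np.random.default_rng(0)
omegas = np.linspace(0.8, 1.8, 6)       # discretized parameter grid (the "arms")
counts = np.zeros(len(omegas))          # pulls per arm
mean_cost = np.zeros(len(omegas))       # running mean iteration count per arm

for t in range(60):                     # online sequence of problem instances
    M = rng.standard_normal((20, 20))
    A = M @ M.T + 20 * np.eye(20)       # SPD, so SOR converges for 0 < omega < 2
    b = rng.standard_normal(20)
    if rng.random() < 0.2 or t < len(omegas):
        arm = t % len(omegas)           # explore (each arm tried at least once)
    else:
        arm = int(np.argmin(mean_cost))  # exploit the currently best omega
    cost = sor_iterations(A, b, omegas[arm])
    counts[arm] += 1
    mean_cost[arm] += (cost - mean_cost[arm]) / counts[arm]

best_omega = omegas[int(np.argmin(mean_cost))]
```

The only information fed back to the learner is the iteration count, mirroring the bandit feedback model in the abstract.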
no code implementations • NeurIPS 2023 • Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar
In this paper, we formalize this notion as learning from explanation constraints and provide a learning theoretic framework to analyze how such explanations can improve the learning of our models.
no code implementations • 22 Feb 2023 • Maria-Florina Balcan, Hedyeh Beyhaghi
We provide the first online learning algorithms for menus of lotteries and two-part tariffs with strong regret-bound guarantees.
no code implementations • 23 Oct 2022 • Maria-Florina Balcan, Rattana Pukdee, Pradeep Ravikumar, Hongyang Zhang
Adversarial training is a standard technique for training adversarially robust models.
1 code implementation • 7 Oct 2022 • Rattana Pukdee, Dylan Sam, Maria-Florina Balcan, Pradeep Ravikumar
Semi-supervised learning and weakly supervised learning are important paradigms that aim to reduce the growing demand for labeled data in current machine learning applications.
no code implementations • 20 Jul 2022 • Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar
We consider the problem of tuning the regularization parameters of Ridge regression, LASSO, and the ElasticNet across multiple problem instances, a setting that encompasses both cross-validation and multi-task hyperparameter optimization.
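A minimal sketch of this setting, using synthetic regression instances that share one true signal: closed-form ridge solutions are scored on held-out data, and the $\lambda$ minimizing average validation loss across instances is kept. This illustrates the multi-instance tuning setup, not the paper's learning algorithm or guarantees.

```python
import numpy as np

rng = np.random.default_rng(1)

def ridge_fit(X, y, lam):
    """Closed-form ridge regression solution."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# A collection of related regression instances sharing one true signal.
w_true = rng.standard_normal(5)
instances = []
for _ in range(10):
    X = rng.standard_normal((30, 5))
    y = X @ w_true + 0.5 * rng.standard_normal(30)
    Xv = rng.standard_normal((30, 5))              # held-out data for this instance
    yv = Xv @ w_true + 0.5 * rng.standard_normal(30)
    instances.append((X, y, Xv, yv))

# Score each candidate regularization strength by its average validation loss.
lams = np.logspace(-3, 2, 12)
avg_val = []
for lam in lams:
    losses = [np.mean((Xv @ ridge_fit(X, y, lam) - yv) ** 2)
              for X, y, Xv, yv in instances]
    avg_val.append(np.mean(losses))

best_lam = lams[int(np.argmin(avg_val))]
```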
no code implementations • 27 May 2022 • Maria-Florina Balcan, Keegan Harris, Mikhail Khodak, Zhiwei Steven Wu
We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure.
no code implementations • 15 Apr 2022 • Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, Ellen Vitercik
These guarantees apply to infinite families of cutting planes, such as the family of Gomory mixed integer cuts, which are responsible for the main breakthrough speedups of integer programming solvers.
no code implementations • 7 Apr 2022 • Maria-Florina Balcan, Christopher Seiler, Dravyansh Sharma
Data-driven algorithm design is a promising, learning-based approach for beyond worst-case analysis of algorithms with tunable parameters.
no code implementations • 8 Mar 2022 • Maria-Florina Balcan, Avrim Blum, Steve Hanneke, Dravyansh Sharma
Remarkably, we provide a complete characterization of learnability in this setting, in particular, nearly-tight matching upper and lower bounds on the region that can be certified, as well as efficient algorithms for computing this region given an ERM oracle.
no code implementations • 18 Feb 2022 • Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar, Sergei Vassilvitskii
A burgeoning paradigm in algorithm design is the field of algorithms with predictions, in which algorithms can take advantage of a possibly imperfect prediction of some aspect of the problem.
no code implementations • 18 Nov 2021 • Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, Ellen Vitercik
If the training set is too small, a configuration may have good performance over the training set but poor performance on future integer programs.
no code implementations • NeurIPS 2021 • Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar
We analyze the meta-learning of the initialization and step-size of learning algorithms for piecewise-Lipschitz functions, a non-convex setting with applications to both machine learning and algorithms.
no code implementations • NeurIPS 2021 • Mikhail Khodak, Renbo Tu, Tian Li, Liam Li, Maria-Florina Balcan, Virginia Smith, Ameet Talwalkar
Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline.
no code implementations • NeurIPS 2021 • Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, Ellen Vitercik
We first bound the sample complexity of learning cutting planes from the canonical family of Chvátal-Gomory cuts.
no code implementations • NeurIPS 2021 • Maria-Florina Balcan, Dravyansh Sharma
Over the past decades, several elegant graph-based semi-supervised learning algorithms have been proposed for inferring the labels of unlabeled examples given the graph and a few labeled examples.
no code implementations • 24 Dec 2020 • Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik
This algorithm configuration procedure works by first selecting a portfolio of diverse algorithm parameter settings, and then, on a given problem instance, using an algorithm selector to choose a parameter setting from the portfolio with strong predicted performance.
1 code implementation • 19 Dec 2020 • Kaiwen Wang, Travis Dick, Maria-Florina Balcan
We provide the first utility guarantees for differentially private top-down decision tree learning in both the single machine and distributed settings.
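One illustrative step of private top-down tree building, under loudly labeled assumptions: binary labels, median thresholds, and a weighted Gini impurity computed from Laplace-noised class counts. The function name, noise calibration, and split rule here are hypothetical stand-ins, not the paper's mechanism or its privacy accounting.

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_gini_split(X, y, epsilon):
    """Pick an axis-aligned split by scoring Laplace-noised class counts
    (one illustrative step of a private top-down tree builder)."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        thresh = np.median(X[:, j])
        left = y[X[:, j] <= thresh]
        right = y[X[:, j] > thresh]
        score = 0.0
        for side in (left, right):
            counts = np.array([np.sum(side == c) for c in (0, 1)], float)
            counts += rng.laplace(scale=1.0 / epsilon, size=2)  # privacy noise
            counts = np.clip(counts, 0, None)
            tot = counts.sum() + 1e-9
            p = counts / tot
            score += tot * (1 - np.sum(p ** 2))  # weighted Gini impurity
        if score < best[2]:
            best = (j, thresh, score)
    return best[0], best[1]

X = rng.standard_normal((200, 3))
y = (X[:, 1] > 0).astype(int)        # feature 1 determines the label
feat, thr = noisy_gini_split(X, y, epsilon=5.0)
```

At this noise level the informative feature still wins the split, which is the point of a utility guarantee: the noise needed for privacy should not drown out the signal in the counts.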
no code implementations • 14 Nov 2020 • Maria-Florina Balcan
Data-driven algorithm design is an important aspect of modern data science and algorithm design.
1 code implementation • 13 Oct 2020 • Maria-Florina Balcan, Avrim Blum, Dravyansh Sharma, Hongyang Zhang
Despite significant advances, deep networks remain highly susceptible to adversarial attack.
no code implementations • 10 Oct 2020 • Maria-Florina Balcan, Nika Haghtalab
This chapter considers the computational and statistical aspects of learning linear thresholds in the presence of noise.
no code implementations • ICML 2020 • Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik
We answer this question for algorithm configuration problems that exhibit a widely-applicable structure: the algorithm's performance as a function of its parameters can be approximated by a "simple" function.
1 code implementation • ICLR 2021 • Liam Li, Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar
Recent state-of-the-art methods for neural architecture search (NAS) exploit gradient-based optimization by relaxing the problem into continuous optimization over architectures and shared-weights, a noisy process that remains poorly understood.
no code implementations • 25 Sep 2019 • Mikhail Khodak, Liam Li, Maria-Florina Balcan, Ameet Talwalkar
Weight-sharing—the simultaneous optimization of multiple neural networks using the same parameters—has emerged as a key component of state-of-the-art neural architecture search.
no code implementations • 8 Aug 2019 • Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik
We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm's average performance over the training set and its expected performance.
no code implementations • 22 Jul 2019 • Maria-Florina Balcan, Travis Dick, Dravyansh Sharma
We consider the class of piecewise Lipschitz functions, which is the most general online setting considered in the literature for the problem, and arises naturally in various combinatorial algorithm selection problems where utility functions can have sharp discontinuities.
no code implementations • ICLR 2020 • Maria-Florina Balcan, Travis Dick, Manuel Lang
Clustering is an important part of many modern data analysis pipelines, including network analysis and data retrieval.
1 code implementation • NeurIPS 2019 • Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar
We build a theoretical framework for designing and understanding practical meta-learning methods that integrates sophisticated formalizations of task-similarity with the extensive literature on online convex optimization and sequential prediction algorithms.
no code implementations • 26 May 2019 • Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik
Our algorithm can help compile a configuration portfolio, or it can be used to select the input to a configuration algorithm for finite parameter spaces.
no code implementations • 18 Apr 2019 • Maria-Florina Balcan, Travis Dick, Wesley Pegden
We apply our semi-bandit results to obtain the first provable guarantees for data-driven algorithm design for linkage-based clustering and we improve the best regret bounds for designing greedy knapsack algorithms.
1 code implementation • 27 Feb 2019 • Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar
We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm bridging the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods.
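A toy sketch in this spirit, assuming quadratic task losses whose minimizers cluster near a shared point: after a few within-task gradient steps, the initialization is nudged toward each task's solution. This is a Reptile-style update used purely for illustration; the step sizes and task model are arbitrary choices, not the paper's meta-algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

# Tasks: quadratic losses 0.5*||w - c||^2 with centers near a shared point.
shared = np.array([2.0, -1.0])
centers = [shared + 0.3 * rng.standard_normal(2) for _ in range(50)]

init = np.zeros(2)
for c in centers:                 # online sequence of tasks
    w = init.copy()
    for _ in range(5):            # a few within-task gradient steps
        w -= 0.3 * (w - c)        # gradient of 0.5*||w - c||^2
    init += 0.1 * (w - init)      # meta-step: move init toward task solution
```

After enough tasks, the learned initialization drifts toward the shared center, so each new task needs only a few steps: the benefit task-similarity is supposed to buy.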
no code implementations • 18 Oct 2018 • Maria-Florina Balcan, Yi Li, David P. Woodruff, Hongyang Zhang
This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix.
no code implementations • NeurIPS 2019 • Maria-Florina Balcan, Travis Dick, Ritesh Noothigattu, Ariel D. Procaccia
In classic fair division problems such as cake cutting and rent division, envy-freeness requires that each individual (weakly) prefer his allocation to anyone else's.
no code implementations • NeurIPS 2018 • Maria-Florina Balcan, Travis Dick, Colin White
Clustering points in metric spaces is a long-studied area of research.
no code implementations • ICML 2018 • Maria-Florina Balcan, Travis Dick, Tuomas Sandholm, Ellen Vitercik
Tree search algorithms recursively partition the search space to find an optimal solution.
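Tree search of this kind can be sketched with a tiny branch-and-bound for 0/1 knapsack: each node branches on taking or skipping an item, and a greedy fractional relaxation prunes subtrees whose bound cannot beat the incumbent. The instance and ratio-ordering heuristic are illustrative choices, not tied to the paper.

```python
def knapsack_bb(values, weights, capacity):
    """Branch and bound for 0/1 knapsack: branch on items by value/weight
    ratio, prune with the fractional (LP) relaxation as an upper bound."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i] / weights[i], reverse=True)
    best = 0

    def bound(idx, cap, val):
        # Greedy fractional relaxation over the remaining items.
        for i in order[idx:]:
            if weights[i] <= cap:
                cap -= weights[i]
                val += values[i]
            else:
                return val + values[i] * cap / weights[i]
        return val

    def branch(idx, cap, val):
        nonlocal best
        best = max(best, val)
        if idx == len(order) or bound(idx, cap, val) <= best:
            return                            # prune: bound cannot beat incumbent
        i = order[idx]
        if weights[i] <= cap:
            branch(idx + 1, cap - weights[i], val + values[i])  # take item i
        branch(idx + 1, cap, val)                               # skip item i

    branch(0, capacity, 0)
    return best

print(knapsack_bb([60, 100, 120], [10, 20, 30], 50))  # classic instance -> 220
```

The order in which items are branched on and how aggressively nodes are pruned are exactly the kinds of tunable decisions the abstract's learning framework targets.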
no code implementations • 8 Nov 2017 • Maria-Florina Balcan, Travis Dick, Ellen Vitercik
We present general techniques for online and private optimization of the sum of dispersed piecewise Lipschitz functions.
no code implementations • ICML 2017 • Daniel McNamara, Maria-Florina Balcan
If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument.
no code implementations • ICML 2017 • Maria-Florina Balcan, Travis Dick, Yingyu Liang, Wenlong Mou, Hongyang Zhang
We study the problem of clustering sensitive data while preserving the privacy of individuals represented in the dataset, which has broad applications in practical machine learning and data analysis tasks.
no code implementations • 30 Jun 2017 • Maria-Florina Balcan, Avrim Blum, Vaishnavh Nagarajan
An important long-term goal in machine learning systems is to build learning agents that, like humans, can learn many tasks over their lifetime, and moreover use information from these tasks to improve their ability to do so efficiently.
no code implementations • 19 May 2017 • Maria-Florina Balcan, Colin White
The typical idea is to design a clustering algorithm that outputs a near-optimal solution, provided the data satisfy a natural stability notion.
no code implementations • 29 Apr 2017 • Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik
We study multi-item profit maximization when there is an underlying distribution over buyers' values.
no code implementations • 27 Apr 2017 • Maria-Florina Balcan, Yingyu Liang, David P. Woodruff, Hongyang Zhang
This work studies the strong duality of non-convex matrix factorization problems: we show that under certain dual conditions, these problems and their duals have the same optimum.
no code implementations • NeurIPS 2017 • Maria-Florina Balcan, Hongyang Zhang
In this work, we introduce new convex geometry tools to study the properties of $s$-concave distributions and use these properties to provide bounds on quantities of interest to learning including the probability of disagreement between two halfspaces, disagreement outside a band, and the disagreement coefficient.
no code implementations • 2 Mar 2017 • Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, David Woodruff
In this work, we study the $k$-median and $k$-means clustering problems when the data is distributed across many servers and can contain outliers.
no code implementations • 8 Dec 2016 • Nan Du, Yingyu Liang, Maria-Florina Balcan, Manuel Gomez-Rodriguez, Hongyuan Zha, Le Song
A typical viral marketing model identifies influential users in a social network to maximize a single product adoption assuming unlimited user attention, campaign budgets, and time.
no code implementations • NeurIPS 2016 • Maria-Florina Balcan, Hongyang Zhang
For this problem, we present an algorithm that returns a matrix with small error, with sample complexity almost as small as the best prior results in the noiseless case.
no code implementations • 14 Nov 2016 • Maria-Florina Balcan, Vaishnavh Nagarajan, Ellen Vitercik, Colin White
We address this problem for clustering, max-cut, and other partitioning problems, such as integer quadratic programming, by designing computationally efficient and sample efficient learning algorithms which receive samples from an application-specific distribution over problem instances and learn a partitioning algorithm with high expected performance.
no code implementations • NeurIPS 2016 • Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik
In traditional economic models, it is assumed that the bidders' valuations are drawn from an underlying distribution and that the auction designer has perfect knowledge of this distribution.
no code implementations • 30 May 2016 • Maria-Florina Balcan, Ellen Vitercik, Colin White
However, for real-valued functions, cardinal labels might not be accessible, or it may be difficult for an expert to consistently assign real-valued labels over the entire set of examples.
no code implementations • 1 Feb 2016 • Gautam Dasarathy, Aarti Singh, Maria-Florina Balcan, Jong Hyuk Park
The problem of learning the structure of a high-dimensional graphical model from data has received considerable attention in recent years.
no code implementations • 21 Jun 2015 • Shang-Tse Chen, Maria-Florina Balcan, Duen Horng Chau
We consider the problem of learning from distributed data in the agnostic setting, i.e., in the presence of arbitrary forms of noise.
no code implementations • 14 May 2015 • Maria-Florina Balcan, Nika Haghtalab, Colin White
In this work, we take this approach and provide strong positive results both for the asymmetric and symmetric $k$-center problems under a natural input stability (promise) condition called $\alpha$-perturbation resilience [Bilu and Linial 2012], which states that the optimal solution does not change under any $\alpha$-factor perturbation to the input distances.
1 code implementation • 30 Apr 2015 • Maria-Florina Balcan, Ariel D. Procaccia, Yair Zick
This paper explores a PAC (probably approximately correct) learning model in cooperative games.
no code implementations • 23 Mar 2015 • Maria-Florina Balcan, Yingyu Liang, Le Song, David Woodruff, Bo Xie
Can we perform kernel PCA on the entire dataset in a distributed and communication efficient fashion while maintaining provable and strong guarantees in solution quality?
no code implementations • 12 Mar 2015 • Pranjal Awasthi, Maria-Florina Balcan, Nika Haghtalab, Ruth Urner
We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit ball in $\Re^d$, for some constant value of $\eta$.
no code implementations • 6 Nov 2014 • Maria-Florina Balcan, Avrim Blum, Santosh Vempala
Specifically, we consider the problem of learning many different target functions over time, that share certain commonalities that are initially unknown to the learning algorithm.
no code implementations • NeurIPS 2014 • Maria-Florina Balcan, Vandana Kanchanapally, Yingyu Liang, David Woodruff
We give new algorithms and analyses for distributed PCA which lead to improved communication and computational costs for $k$-means clustering and related problems.
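A minimal sketch of the distributed-PCA idea, assuming each server holds a shard of synthetic low-rank data: every server communicates only its scaled top-$k$ right singular vectors, and the coordinator recovers the principal subspace from the stacked summaries. The shapes, noise level, and subspace-overlap check are all illustrative, not the paper's protocol.

```python
import numpy as np

rng = np.random.default_rng(3)

def local_summary(X, k):
    """Each server sends only its top-k singular directions, scaled:
    a k x d summary instead of the full n x d shard."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return np.diag(s[:k]) @ Vt[:k]

d, k = 8, 2
basis = rng.standard_normal((k, d))          # shared low-dimensional signal
shards = [rng.standard_normal((100, k)) @ basis
          + 0.05 * rng.standard_normal((100, d))
          for _ in range(4)]                 # one data shard per server

# Coordinator: stack the tiny summaries and take their top-k subspace.
stacked = np.vstack([local_summary(X, k) for X in shards])
_, _, Vt = np.linalg.svd(stacked, full_matrices=False)
V_approx = Vt[:k]

# Compare against PCA on all the data pooled together.
_, _, Vt_full = np.linalg.svd(np.vstack(shards), full_matrices=False)
overlap = np.linalg.norm(V_approx @ Vt_full[:k].T)  # ~sqrt(k) if subspaces agree
```

Each server's message is $k \times d$ numbers regardless of how many points it holds, which is where the communication savings come from.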
no code implementations • 9 Aug 2014 • Konstantin Voevodski, Maria-Florina Balcan, Heiko Röglin, Shang-Hua Teng, Yu Xia
Given a point set S and an unknown metric d on S, we study the problem of efficiently partitioning S into k clusters while querying few distances between the points.
no code implementations • 30 Jul 2014 • Maria-Florina Balcan, Amit Daniely, Ruta Mehta, Ruth Urner, Vijay V. Vazirani
In this work we advance this line of work by providing sample complexity guarantees and efficient algorithms for a number of important classes.
1 code implementation • NeurIPS 2014 • Bo Dai, Bo Xie, Niao He, Yingyu Liang, Anant Raj, Maria-Florina Balcan, Le Song
The general perception is that kernel methods are not scalable, and neural nets are the methods of choice for nonlinear learning problems.
no code implementations • NeurIPS 2014 • Maria-Florina Balcan, Chris Berlind, Avrim Blum, Emma Cohen, Kaushik Patnaik, Le Song
We examine an important setting for engineered systems in which low-power distributed sensors are each making highly noisy measurements of some unknown target function.
no code implementations • 9 Apr 2014 • Aurélien Bellet, Yingyu Liang, Alireza Bagheri Garakani, Maria-Florina Balcan, Fei Sha
We further show that the communication cost of dFW is optimal by deriving a lower-bound on the communication cost required to construct an $\epsilon$-approximate solution.
no code implementations • 1 Jan 2014 • Maria-Florina Balcan, Yingyu Liang, Pramod Gupta
One of the most widely used techniques for data clustering is agglomerative clustering.
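The technique above can be sketched with a naive single-linkage variant of agglomerative clustering: repeatedly merge the two clusters whose closest pair of points is nearest, until $k$ clusters remain (quadratic space and far from the fastest implementation, for illustration only).

```python
import numpy as np

def single_linkage(points, k):
    """Naive agglomerative (single-linkage) clustering down to k clusters."""
    clusters = [[i] for i in range(len(points))]
    D = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    while len(clusters) > k:
        best = (np.inf, None, None)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between closest cross-cluster pair.
                d = min(D[i, j] for i in clusters[a] for j in clusters[b])
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]   # merge the two closest clusters
        clusters.pop(b)
    return clusters

pts = np.array([[0, 0], [0, 1], [10, 0], [10, 1]], float)
print(sorted(sorted(c) for c in single_linkage(pts, 2)))  # [[0, 1], [2, 3]]
```

Swapping the `min` for a `max` or a mean gives complete or average linkage, the kind of design choice such analyses compare.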
no code implementations • 24 Dec 2013 • Pranjal Awasthi, Maria-Florina Balcan, Konstantin Voevodski
We study the design of interactive clustering algorithms for data sets satisfying natural stability assumptions.