Search Results for author: Maria-Florina Balcan

Found 66 papers, 8 papers with code

Scalable Kernel Methods via Doubly Stochastic Gradients

1 code implementation NeurIPS 2014 Bo Dai, Bo Xie, Niao He, Yingyu Liang, Anant Raj, Maria-Florina Balcan, Le Song

The general perception is that kernel methods are not scalable, and neural nets are the methods of choice for nonlinear learning problems.
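
To make the scalability point concrete, here is a minimal, hedged sketch in the spirit of the paper: kernel regression approximated with random Fourier features and trained by plain stochastic gradient descent. The paper's doubly stochastic method also samples the random features on the fly rather than fixing them up front, which this toy version does not do; the feature count, bandwidth, and step size below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression problem: y = sin(3x) + noise
n, d = 2000, 1
X = rng.uniform(-2, 2, size=(n, d))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(n)

# Random Fourier features approximating an RBF kernel exp(-gamma * ||x - x'||^2)
D = 512                                   # number of random features (assumption)
gamma = 2.0                               # kernel bandwidth (assumption)
W = rng.normal(0.0, np.sqrt(2 * gamma), size=(d, D))
b = rng.uniform(0, 2 * np.pi, size=D)

def features(x):
    return np.sqrt(2.0 / D) * np.cos(x @ W + b)

# Plain SGD on the regularized least-squares objective in random-feature space
theta = np.zeros(D)
lr, lam = 0.1, 1e-4
for epoch in range(20):
    for i in rng.permutation(n):
        phi = features(X[i:i + 1])[0]
        grad = (phi @ theta - y[i]) * phi + lam * theta
        theta -= lr * grad

print("train MSE:", np.mean((features(X) @ theta - y) ** 2))
```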

Geometry-Aware Gradient Algorithms for Neural Architecture Search

1 code implementation ICLR 2021 Liam Li, Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

Recent state-of-the-art methods for neural architecture search (NAS) exploit gradient-based optimization by relaxing the problem into continuous optimization over architectures and shared-weights, a noisy process that remains poorly understood.

Neural Architecture Search

Adaptive Gradient-Based Meta-Learning Methods

1 code implementation NeurIPS 2019 Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

We build a theoretical framework for designing and understanding practical meta-learning methods that integrates sophisticated formalizations of task-similarity with the extensive literature on online convex optimization and sequential prediction algorithms.

Federated Learning, Few-Shot Learning
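
As illustrative context only (not the paper's ARUBA framework itself), the sketch below meta-learns an initialization for within-task online gradient descent by nudging it toward each task's final iterate, which pays off when task optima are close together; the task family, step sizes, and update rule are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# A family of similar tasks: quadratic losses 0.5 * ||w - c_t||^2 with nearby centers c_t
def sample_task_center():
    return np.array([2.0, -1.0]) + 0.1 * rng.standard_normal(2)

def within_task_ogd(init, center, steps=20, lr=0.2):
    """Online gradient descent on one task; returns average loss and final iterate."""
    w, total = init.copy(), 0.0
    for _ in range(steps):
        total += 0.5 * np.sum((w - center) ** 2)
        w -= lr * (w - center)
    return total / steps, w

meta_init, meta_lr = np.zeros(2), 0.3       # meta step size is an arbitrary choice
for t in range(50):
    c = sample_task_center()
    avg_loss, w_final = within_task_ogd(meta_init, c)
    # Move the shared initialization toward this task's final iterate
    meta_init += meta_lr * (w_final - meta_init)
    if t % 10 == 0:
        print(f"task {t:2d}  average within-task loss {avg_loss:.4f}")
```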

Provable Guarantees for Gradient-Based Meta-Learning

1 code implementation 27 Feb 2019 Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar

We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm bridging the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods.

Generalization Bounds, Meta-Learning

Label Propagation with Weak Supervision

1 code implementation 7 Oct 2022 Rattana Pukdee, Dylan Sam, Maria-Florina Balcan, Pradeep Ravikumar

Semi-supervised learning and weakly supervised learning are important paradigms that aim to reduce the growing demand for labeled data in current machine learning applications.

Weakly Supervised Classification, Weakly-supervised Learning
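
For orientation, a minimal sketch of classical graph label propagation on synthetic data (standard iterative propagation with clamped labeled nodes, not the paper's weak-supervision variant); the similarity graph and the two seed labels are fabricated for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated Gaussian clusters and an RBF similarity graph
n = 200
X = np.vstack([rng.normal(-2, 0.5, (n // 2, 2)), rng.normal(2, 0.5, (n // 2, 2))])
true = np.array([0] * (n // 2) + [1] * (n // 2))
W = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
np.fill_diagonal(W, 0.0)

# Only two labeled points, one per cluster
labeled = np.array([0, n - 1])
F = np.zeros((n, 2))
F[labeled, true[labeled]] = 1.0

# Iterate F <- row_normalized(W) @ F, clamping the labeled rows each step
P = W / W.sum(axis=1, keepdims=True)
for _ in range(100):
    F = P @ F
    F[labeled] = 0.0
    F[labeled, true[labeled]] = 1.0

pred = F.argmax(axis=1)
mask = ~np.isin(np.arange(n), labeled)
print("accuracy on unlabeled points:", (pred[mask] == true[mask]).mean())
```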

Scalable and Provably Accurate Algorithms for Differentially Private Distributed Decision Tree Learning

1 code implementation 19 Dec 2020 Kaiwen Wang, Travis Dick, Maria-Florina Balcan

We provide the first utility guarantees for differentially private top-down decision tree learning in both the single machine and distributed settings.

Privacy Preserving
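
To make one ingredient concrete: differentially private top-down tree learning typically selects each split with a noisy mechanism rather than the exact best split. The hedged sketch below uses the exponential mechanism over candidate splits with Gini gain as the utility; it is illustrative only and does not reproduce the paper's algorithm, sensitivity analysis, or privacy accounting.

```python
import numpy as np

rng = np.random.default_rng(0)

def gini_gain(y, mask):
    """Utility of a candidate split: reduction in Gini impurity (binary labels)."""
    def gini(v):
        if len(v) == 0:
            return 0.0
        p = np.bincount(v, minlength=2) / len(v)
        return 1.0 - np.sum(p ** 2)
    n, nl = len(y), mask.sum()
    return gini(y) - (nl / n) * gini(y[mask]) - ((n - nl) / n) * gini(y[~mask])

def private_split(X, y, thresholds, epsilon, sensitivity=1.0):
    """Pick a (feature, threshold) split via the exponential mechanism."""
    candidates = [(j, t) for j in range(X.shape[1]) for t in thresholds]
    utilities = np.array([gini_gain(y, X[:, j] <= t) for j, t in candidates])
    scores = epsilon * utilities / (2 * sensitivity)
    probs = np.exp(scores - scores.max())     # subtract max for numerical stability
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]

# Tiny synthetic example: the informative feature is column 1
X = rng.uniform(0, 1, size=(500, 3))
y = (X[:, 1] > 0.6).astype(int)
print("chosen split:", private_split(X, y, thresholds=np.linspace(0.1, 0.9, 9), epsilon=1.0))
```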

Learning to Branch

no code implementations ICML 2018 Maria-Florina Balcan, Travis Dick, Tuomas Sandholm, Ellen Vitercik

Tree search algorithms recursively partition the search space to find an optimal solution.

Variable Selection

Dispersion for Data-Driven Algorithm Design, Online Learning, and Private Optimization

no code implementations 8 Nov 2017 Maria-Florina Balcan, Travis Dick, Ellen Vitercik

We present general techniques for online and private optimization of the sum of dispersed piecewise Lipschitz functions.
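
A common building block in this line of work is the exponential forecaster run over the parameter space; the hedged sketch below discretizes a 1-D parameter and runs exponentially weighted updates against a stream of dispersed threshold functions. The grid size, step size, and reward family are assumptions for the example, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stream of piecewise-constant rewards over a parameter rho in [0, 1]:
# reward_t(rho) = 1 if rho >= b_t else 0, with thresholds b_t spread near 0.5
T = 500
thresholds = 0.5 + 0.05 * rng.standard_normal(T)

grid = np.linspace(0, 1, 201)              # discretized parameter space (assumption)
weights = np.ones_like(grid)
step = np.sqrt(np.log(len(grid)) / T)      # standard (untuned) learning rate

total = 0.0
for b in thresholds:
    probs = weights / weights.sum()
    rho = rng.choice(grid, p=probs)        # play a parameter drawn from the weights
    total += float(rho >= b)
    # Full-information update: every grid point observes its own reward
    weights *= np.exp(step * (grid >= b).astype(float))

best_fixed = (grid[:, None] >= thresholds[None, :]).sum(axis=1).max()
print(f"algorithm reward {total:.0f} vs best fixed grid parameter {best_fixed}")
```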

Matrix Completion and Related Problems via Strong Duality

no code implementations 27 Apr 2017 Maria-Florina Balcan, Yingyu Liang, David P. Woodruff, Hongyang Zhang

This work studies the strong duality of non-convex matrix factorization problems: we show that under certain dual conditions, these problems and their duals have the same optimum.

Matrix Completion

Sample and Computationally Efficient Learning Algorithms under S-Concave Distributions

no code implementations NeurIPS 2017 Maria-Florina Balcan, Hongyang Zhang

In this work, we introduce new convex geometry tools to study the properties of $s$-concave distributions and use these properties to provide bounds on quantities of interest to learning including the probability of disagreement between two halfspaces, disagreement outside a band, and the disagreement coefficient.

Active Learning

Robust Communication-Optimal Distributed Clustering Algorithms

no code implementations 2 Mar 2017 Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, David Woodruff

In this work, we study the $k$-median and $k$-means clustering problems when the data is distributed across many servers and can contain outliers.

Clustering

Lifelong Learning in Costly Feature Spaces

no code implementations 30 Jun 2017 Maria-Florina Balcan, Avrim Blum, Vaishnavh Nagarajan

An important long-term goal in machine learning systems is to build learning agents that, like humans, can learn many tasks over their lifetime, and moreover use information from these tasks to improve their ability to do so efficiently.

Clustering under Local Stability: Bridging the Gap between Worst-Case and Beyond Worst-Case Analysis

no code implementations 19 May 2017 Maria-Florina Balcan, Colin White

The typical idea is to design a clustering algorithm that outputs a near-optimal solution, provided the data satisfy a natural stability notion.

Clustering

Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems

no code implementations 14 Nov 2016 Maria-Florina Balcan, Vaishnavh Nagarajan, Ellen Vitercik, Colin White

We address this problem for clustering, max-cut, and other partitioning problems, such as integer quadratic programming, by designing computationally efficient and sample efficient learning algorithms which receive samples from an application-specific distribution over problem instances and learn a partitioning algorithm with high expected performance.

Clustering, Learning Theory

Scalable Influence Maximization for Multiple Products in Continuous-Time Diffusion Networks

no code implementations 8 Dec 2016 Nan Du, Yingyu Liang, Maria-Florina Balcan, Manuel Gomez-Rodriguez, Hongyuan Zha, Le Song

A typical viral marketing model identifies influential users in a social network to maximize a single product adoption assuming unlimited user attention, campaign budgets, and time.

Marketing

Noise-Tolerant Life-Long Matrix Completion via Adaptive Sampling

no code implementations NeurIPS 2016 Maria-Florina Balcan, Hongyang Zhang

For this problem, we present an algorithm that returns a matrix with small error, with sample complexity almost as small as the best prior results in the noiseless case.

Matrix Completion

Communication Efficient Distributed Agnostic Boosting

no code implementations 21 Jun 2015 Shang-Tse Chen, Maria-Florina Balcan, Duen Horng Chau

We consider the problem of learning from distributed data in the agnostic setting, i.e., in the presence of arbitrary forms of noise.

Sample Complexity of Automated Mechanism Design

no code implementations NeurIPS 2016 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

In the traditional economic models, it is assumed that the bidders' valuations are drawn from an underlying distribution and that the auction designer has perfect knowledge of this distribution.

Combinatorial Optimization, Learning Theory

Learning Combinatorial Functions from Pairwise Comparisons

no code implementations 30 May 2016 Maria-Florina Balcan, Ellen Vitercik, Colin White

However, for real-valued functions, cardinal labels might not be accessible, or it may be difficult for an expert to consistently assign real-valued labels over the entire set of examples.

BIG-bench Machine Learning

Active Learning Algorithms for Graphical Model Selection

no code implementations 1 Feb 2016 Gautam Dasarathy, Aarti Singh, Maria-Florina Balcan, Jong Hyuk Park

The problem of learning the structure of a high dimensional graphical model from data has received considerable attention in recent years.

Active Learning, Model Selection

$k$-center Clustering under Perturbation Resilience

no code implementations 14 May 2015 Maria-Florina Balcan, Nika Haghtalab, Colin White

In this work, we take this approach and provide strong positive results both for the asymmetric and symmetric $k$-center problems under a natural input stability (promise) condition called $\alpha$-perturbation resilience [Bilu and Linial 2012], which states that the optimal solution does not change under any $\alpha$-factor perturbation to the input distances.

Clustering
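
For background only, the classic greedy 2-approximation for symmetric $k$-center (farthest-first traversal) is sketched below; under perturbation resilience the paper obtains much stronger (exact) recovery guarantees that this baseline does not provide.

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy_k_center(points, k):
    """Gonzalez's farthest-first traversal: a 2-approximation for symmetric k-center."""
    centers = [0]                                   # start from an arbitrary point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))                  # farthest point from current centers
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers, dist.max()                      # center indices and k-center radius

X = np.vstack([rng.normal(c, 0.3, (100, 2)) for c in [(0, 0), (5, 0), (0, 5)]])
centers, radius = greedy_k_center(X, k=3)
print("chosen center indices:", centers, " radius:", round(float(radius), 3))
```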

Communication Efficient Distributed Kernel Principal Component Analysis

no code implementations 23 Mar 2015 Maria-Florina Balcan, Yingyu Liang, Le Song, David Woodruff, Bo Xie

Can we perform kernel PCA on the entire dataset in a distributed and communication efficient fashion while maintaining provable and strong guarantees in solution quality?

Local algorithms for interactive clustering

no code implementations 24 Dec 2013 Pranjal Awasthi, Maria-Florina Balcan, Konstantin Voevodski

We study the design of interactive clustering algorithms for data sets satisfying natural stability assumptions.

Clustering

Efficient Learning of Linear Separators under Bounded Noise

no code implementations 12 Mar 2015 Pranjal Awasthi, Maria-Florina Balcan, Nika Haghtalab, Ruth Urner

We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit ball in $\Re^d$, for some constant value of $\eta$.

Active Learning, Learning Theory
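
For reference, the bounded (Massart) noise model referred to above can be stated as follows (standard definition, paraphrased): conditioned on each point $x$, the observed label disagrees with the target halfspace $w^*$ with probability at most a constant $\eta$, i.e.

$\Pr[\, y \neq \mathrm{sign}(w^* \cdot x) \mid x \,] \le \eta < 1/2 \quad \text{for all } x.$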

A Distributed Frank-Wolfe Algorithm for Communication-Efficient Sparse Learning

no code implementations 9 Apr 2014 Aurélien Bellet, Yingyu Liang, Alireza Bagheri Garakani, Maria-Florina Balcan, Fei Sha

We further show that the communication cost of dFW is optimal by deriving a lower-bound on the communication cost required to construct an $\epsilon$-approximate solution.

Sparse Learning
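
Since the entry is terse, here is a self-contained sketch of the basic (centralized) Frank-Wolfe step on an $\ell_1$ ball, whose linear minimization oracle returns a signed, scaled basis vector; the distributed coordination analyzed in the paper is not reproduced, and the problem sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Least squares over the l1 ball:  min_{||w||_1 <= tau} 0.5 * ||Xw - y||^2
n, d, tau = 200, 50, 2.0
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [1.0, -0.7, 0.5]
y = X @ w_true + 0.01 * rng.standard_normal(n)

w = np.zeros(d)
for t in range(200):
    grad = X.T @ (X @ w - y)
    # Linear minimization oracle over the l1 ball: a signed, scaled basis vector
    j = int(np.argmax(np.abs(grad)))
    s = np.zeros(d)
    s[j] = -tau * np.sign(grad[j])
    gamma = 2.0 / (t + 2)                 # standard step size
    w = (1 - gamma) * w + gamma * s       # convex combination keeps w feasible (and sparse)

print("nonzeros:", int(np.count_nonzero(np.abs(w) > 1e-3)),
      " objective:", round(0.5 * float(np.sum((X @ w - y) ** 2)), 4))
```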

Improved Distributed Principal Component Analysis

no code implementations NeurIPS 2014 Maria-Florina Balcan, Vandana Kanchanapally, Yingyu Liang, David Woodruff

We give new algorithms and analyses for distributed PCA which lead to improved communication and computational costs for $k$-means clustering and related problems.

Clustering, Computational Efficiency +1
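
As a rough illustration of the kind of protocol analyzed in this line of work (the exact summaries and guarantees differ), each server can send a small local SVD-based sketch to a coordinator, which computes principal components of the stacked sketches; the per-server summary size below is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_summary(A, t):
    """Each server keeps its top-t right singular directions scaled by singular values."""
    _, S, Vt = np.linalg.svd(A, full_matrices=False)
    return np.diag(S[:t]) @ Vt[:t]                  # a t x d sketch of the local data

# Data split across 4 servers: shared rank-k structure plus noise
d, k = 30, 3
basis = np.linalg.qr(rng.standard_normal((d, k)))[0]
parts = [rng.standard_normal((500, k)) @ basis.T + 0.05 * rng.standard_normal((500, d))
         for _ in range(4)]

t = 2 * k                                           # per-server summary size (assumption)
stacked = np.vstack([local_summary(A, t) for A in parts])
V_dist = np.linalg.svd(stacked, full_matrices=False)[2][:k].T

# Compare against centralized PCA on all of the data
V_full = np.linalg.svd(np.vstack(parts), full_matrices=False)[2][:k].T
cosines = np.linalg.svd(V_dist.T @ V_full, compute_uv=False)
print("smallest cosine of principal angles (1 = identical subspace):",
      round(float(cosines.min()), 4))
```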

Efficient Representations for Life-Long Learning and Autoencoding

no code implementations 6 Nov 2014 Maria-Florina Balcan, Avrim Blum, Santosh Vempala

Specifically, we consider the problem of learning many different target functions over time, that share certain commonalities that are initially unknown to the learning algorithm.

Efficient Clustering with Limited Distance Information

no code implementations 9 Aug 2014 Konstantin Voevodski, Maria-Florina Balcan, Heiko Röglin, Shang-Hua Teng, Yu Xia

Given a point set S and an unknown metric d on S, we study the problem of efficiently partitioning S into k clusters while querying few distances between the points.

Clustering

Learning Economic Parameters from Revealed Preferences

no code implementations 30 Jul 2014 Maria-Florina Balcan, Amit Daniely, Ruta Mehta, Ruth Urner, Vijay V. Vazirani

In this work we advance this line of work by providing sample complexity guarantees and efficient algorithms for a number of important classes.

Open-Ended Question Answering

Robust Hierarchical Clustering

no code implementations 1 Jan 2014 Maria-Florina Balcan, Yingyu Liang, Pramod Gupta

One of the most widely used techniques for data clustering is agglomerative clustering.

Clustering

Active Learning and Best-Response Dynamics

no code implementations NeurIPS 2014 Maria-Florina Balcan, Chris Berlind, Avrim Blum, Emma Cohen, Kaushik Patnaik, Le Song

We examine an important setting for engineered systems in which low-power distributed sensors are each making highly noisy measurements of some unknown target function.

Active Learning, Denoising

Envy-Free Classification

no code implementations NeurIPS 2019 Maria-Florina Balcan, Travis Dick, Ritesh Noothigattu, Ariel D. Procaccia

In classic fair division problems such as cake cutting and rent division, envy-freeness requires that each individual (weakly) prefer his allocation to anyone else's.

Classification, Fairness +1
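
Carried over to classification, envy-freeness informally requires that no individual prefers another individual's (possibly randomized) assignment to their own (my paraphrase, not the paper's formal statement):

$u_i\big(h(x_i)\big) \;\ge\; u_i\big(h(x_j)\big) \quad \text{for all individuals } i, j,$

where $h$ is the classifier and $u_i$ is individual $i$'s utility over outcomes.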

Testing Matrix Rank, Optimally

no code implementations 18 Oct 2018 Maria-Florina Balcan, Yi Li, David P. Woodruff, Hongyang Zhang

This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix.

Learning Cooperative Games

1 code implementation 30 Apr 2015 Maria-Florina Balcan, Ariel D. Procaccia, Yair Zick

This paper explores a PAC (probably approximately correct) learning model in cooperative games.

Computer Science and Game Theory

Risk Bounds for Transferring Representations With and Without Fine-Tuning

no code implementations ICML 2017 Daniel McNamara, Maria-Florina Balcan

If the representation learned from the source task is fixed, we identify conditions on how the tasks relate to obtain an upper bound on target task risk via a VC dimension-based argument.

Word Embeddings

Differentially Private Clustering in High-Dimensional Euclidean Spaces

no code implementations ICML 2017 Maria-Florina Balcan, Travis Dick, Yingyu Liang, Wenlong Mou, Hongyang Zhang

We study the problem of clustering sensitive data while preserving the privacy of individuals represented in the dataset, which has broad applications in practical machine learning and data analysis tasks.

Clustering, Vocal Bursts Intensity Prediction

Semi-bandit Optimization in the Dispersed Setting

no code implementations 18 Apr 2019 Maria-Florina Balcan, Travis Dick, Wesley Pegden

We apply our semi-bandit results to obtain the first provable guarantees for data-driven algorithm design for linkage-based clustering and we improve the best regret bounds for designing greedy knapsack algorithms.

Clustering

Learning to Optimize Computational Resources: Frugal Training with Generalization Guarantees

no code implementations 26 May 2019 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

Our algorithm can help compile a configuration portfolio, or it can be used to select the input to a configuration algorithm for finite parameter spaces.

Clustering

Learning to Link

no code implementations ICLR 2020 Maria-Florina Balcan, Travis Dick, Manuel Lang

Clustering is an important part of many modern data analysis pipelines, including network analysis and data retrieval.

Clustering, Metric Learning +1

Learning piecewise Lipschitz functions in changing environments

no code implementations 22 Jul 2019 Maria-Florina Balcan, Travis Dick, Dravyansh Sharma

We consider the class of piecewise Lipschitz functions, which is the most general online setting considered in the literature for the problem, and arises naturally in various combinatorial algorithm selection problems where utility functions can have sharp discontinuities.

Clustering, Online Clustering

How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design

no code implementations 8 Aug 2019 Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik

We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm's average performance over the training set and its expected performance.

Clustering, Generalization Bounds
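
The generic shape of such a guarantee (not the paper's specific bound) is a uniform-convergence statement over a parameterized algorithm family $\{A_\rho\}$: with probability at least $1-\delta$ over $N$ training instances $x_1, \dots, x_N \sim \mathcal{D}$,

$\sup_{\rho} \left| \frac{1}{N}\sum_{i=1}^{N} u(A_\rho, x_i) - \mathbb{E}_{x \sim \mathcal{D}}\big[u(A_\rho, x)\big] \right| \le \varepsilon(N, \delta),$

where $u(A_\rho, x)$ measures the performance of the algorithm with parameters $\rho$ on instance $x$.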

Refined bounds for algorithm configuration: The knife-edge of dual class approximability

no code implementations ICML 2020 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

We answer this question for algorithm configuration problems that exhibit a widely-applicable structure: the algorithm's performance as a function of its parameters can be approximated by a "simple" function.

Noise in Classification

no code implementations 10 Oct 2020 Maria-Florina Balcan, Nika Haghtalab

This chapter considers the computational and statistical aspects of learning linear thresholds in the presence of noise.

Classification, General Classification

Data-driven Algorithm Design

no code implementations 14 Nov 2020 Maria-Florina Balcan

Data-driven algorithm design is an important aspect of modern data science and algorithm design.

Generalization in portfolio-based algorithm selection

no code implementations 24 Dec 2020 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

This algorithm configuration procedure works by first selecting a portfolio of diverse algorithm parameter settings, and then, on a given problem instance, using an algorithm selector to choose a parameter setting from the portfolio with strong predicted performance.

Data driven semi-supervised learning

no code implementations NeurIPS 2021 Maria-Florina Balcan, Dravyansh Sharma

Over the past decades, several elegant graph-based semi-supervised learning algorithms have been proposed for inferring the labels of unlabeled examples given the graph and a few labeled examples.

Active Learning

Learning-to-learn non-convex piecewise-Lipschitz functions

no code implementations NeurIPS 2021 Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar

We analyze the meta-learning of the initialization and step-size of learning algorithms for piecewise-Lipschitz functions, a non-convex setting with applications to both machine learning and algorithms.

Meta-Learning

Improved Sample Complexity Bounds for Branch-and-Cut

no code implementations 18 Nov 2021 Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, Ellen Vitercik

If the training set is too small, a configuration may have good performance over the training set but poor performance on future integer programs.

On Weight-Sharing and Bilevel Optimization in Architecture Search

no code implementations 25 Sep 2019 Mikhail Khodak, Liam Li, Maria-Florina Balcan, Ameet Talwalkar

Weight-sharing—the simultaneous optimization of multiple neural networks using the same parameters—has emerged as a key component of state-of-the-art neural architecture search.

Bilevel Optimization, feature selection +1

Learning Predictions for Algorithms with Predictions

no code implementations 18 Feb 2022 Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar, Sergei Vassilvitskii

A burgeoning paradigm in algorithm design is the field of algorithms with predictions, in which algorithms can take advantage of a possibly-imperfect prediction of some aspect of the problem.

Scheduling

Robustly-reliable learners under poisoning attacks

no code implementations 8 Mar 2022 Maria-Florina Balcan, Avrim Blum, Steve Hanneke, Dravyansh Sharma

Remarkably, we provide a complete characterization of learnability in this setting, in particular, nearly-tight matching upper and lower bounds on the region that can be certified, as well as efficient algorithms for computing this region given an ERM oracle.

Data Poisoning

Output-sensitive ERM-based techniques for data-driven algorithm design

no code implementations 7 Apr 2022 Maria-Florina Balcan, Christopher Seiler, Dravyansh Sharma

Data-driven algorithm design is a promising, learning-based approach for beyond worst-case analysis of algorithms with tunable parameters.

Clustering

Structural Analysis of Branch-and-Cut and the Learnability of Gomory Mixed Integer Cuts

no code implementations 15 Apr 2022 Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, Ellen Vitercik

These guarantees apply to infinite families of cutting planes, such as the family of Gomory mixed integer cuts, which are responsible for the main breakthrough speedups of integer programming solvers.

BIG-bench Machine Learning

Meta-Learning Adversarial Bandits

no code implementations 27 May 2022 Maria-Florina Balcan, Keegan Harris, Mikhail Khodak, Zhiwei Steven Wu

We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure.

Meta-Learning, Multi-Armed Bandits
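
For context, the single-task workhorse in this setting is an adversarial bandit algorithm such as Exp3; a minimal loss-based Exp3 sketch follows. The paper's contribution, meta-learning the initialization and step size across tasks, is not shown, and the loss matrix and step size here are fabricated for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def exp3(loss_matrix, eta):
    """Loss-based Exp3 on adversarial bandit feedback (rows = rounds, cols = arms)."""
    T, K = loss_matrix.shape
    weights = np.ones(K)
    total = 0.0
    for t in range(T):
        probs = weights / weights.sum()
        arm = rng.choice(K, p=probs)
        loss = loss_matrix[t, arm]
        total += loss
        est = np.zeros(K)
        est[arm] = loss / probs[arm]       # importance-weighted loss estimate
        weights *= np.exp(-eta * est)
    return total

T, K = 2000, 5
losses = rng.uniform(0, 1, size=(T, K))
losses[:, 2] *= 0.5                         # arm 2 is better on average
eta = np.sqrt(np.log(K) / (T * K))          # standard step size (assumes T is known)
print("Exp3 cumulative loss:", round(exp3(losses, eta), 1),
      " best fixed arm:", round(float(losses.sum(axis=0).min()), 1))
```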

Provably tuning the ElasticNet across instances

no code implementations 20 Jul 2022 Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar

We consider the problem of tuning the regularization parameters of Ridge regression, LASSO, and the ElasticNet across multiple problem instances, a setting that encompasses both cross-validation and multi-task hyperparameter optimization.

Hyperparameter Optimization, regression
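
A hedged sketch of the setting (not the paper's provable tuning procedure): given several related regression instances, pick the ElasticNet regularization pair that minimizes average validation error across instances. The candidate grid and synthetic instances below are assumptions.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_instance(n=120, d=30, noise=0.1):
    """One synthetic sparse-regression instance, split into train/validation."""
    w = np.zeros(d)
    w[:5] = rng.standard_normal(5)
    X = rng.standard_normal((n, d))
    y = X @ w + noise * rng.standard_normal(n)
    return train_test_split(X, y, test_size=0.3, random_state=0)

instances = [make_instance() for _ in range(5)]
grid = [(a, r) for a in (0.01, 0.1, 1.0) for r in (0.2, 0.5, 0.8)]   # (alpha, l1_ratio)

def avg_val_error(alpha, l1_ratio):
    errs = []
    for X_tr, X_va, y_tr, y_va in instances:
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=5000).fit(X_tr, y_tr)
        errs.append(np.mean((model.predict(X_va) - y_va) ** 2))
    return float(np.mean(errs))

best = min(grid, key=lambda p: avg_val_error(*p))
print("best (alpha, l1_ratio) averaged across instances:", best)
```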

Learning Revenue Maximizing Menus of Lotteries and Two-Part Tariffs

no code implementations 22 Feb 2023 Maria-Florina Balcan, Hedyeh Beyhaghi

We advance a recently flourishing line of work at the intersection of learning theory and computational economics by studying the learnability of two classes of mechanisms prominent in economics, namely menus of lotteries and two-part tariffs.

Learning Theory, Vocal Bursts Valence Prediction

Learning with Explanation Constraints

no code implementations NeurIPS 2023 Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar

In this paper, we formalize this notion as learning from explanation constraints and provide a learning theoretic framework to analyze how such explanations can improve the learning of our models.

Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances

no code implementations 3 Oct 2023 Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar

For this method, we prove that a bandit online learning algorithm -- using only the number of iterations as feedback -- can select parameters for a sequence of instances such that the overall cost approaches that of the best fixed $\omega$ as the sequence length increases.
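
A toy sketch in the spirit of the setting (not the paper's algorithm or guarantees): successive over-relaxation (SOR) on a sequence of similar symmetric positive definite systems, with the relaxation parameter $\omega$ chosen from a small grid by an epsilon-greedy bandit that only observes iteration counts; the grid, exploration rate, and system family are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sor_iterations(A, b, omega, tol=1e-8, max_iter=2000):
    """Run successive over-relaxation; return iterations needed to reach the tolerance."""
    n = len(b)
    x = np.zeros(n)
    for it in range(1, max_iter + 1):
        for i in range(n):
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(A @ x - b) <= tol * np.linalg.norm(b):
            return it
    return max_iter

def make_system(n=30):
    """A small SPD system (1-D Laplacian) with a fresh random right-hand side."""
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return A, rng.standard_normal(n)

omegas = np.linspace(1.0, 1.9, 10)          # candidate relaxation parameters (assumption)
iters_sum = np.zeros(len(omegas))
pulls = np.zeros(len(omegas))

for _ in range(40):                         # a sequence of related instances
    A, b = make_system()
    if pulls.min() == 0:                    # try every omega once first
        j = int(np.argmin(pulls))
    elif rng.random() < 0.2:                # then explore occasionally (epsilon-greedy)
        j = int(rng.integers(len(omegas)))
    else:                                   # otherwise exploit the best average so far
        j = int(np.argmin(iters_sum / pulls))
    iters_sum[j] += sor_iterations(A, b, omegas[j])
    pulls[j] += 1

best = int(np.argmin(iters_sum / pulls))
print("empirically best omega:", round(float(omegas[best]), 2))
```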

Regret Minimization in Stackelberg Games with Side Information

no code implementations 13 Feb 2024 Keegan Harris, Zhiwei Steven Wu, Maria-Florina Balcan

Stackelberg games are perhaps one of the biggest success stories of algorithmic game theory over the last decade, as algorithms for playing in Stackelberg games have been deployed in many real-world domains including airport security, anti-poaching efforts, and cyber-crime prevention.
