2 code implementations • 23 Jul 2021 • Abhishek Kumar, Harikrishna Narasimhan, Andrew Cotter
We consider a popular family of constrained optimization problems arising in machine learning that involve optimizing a non-decomposable evaluation metric with a certain thresholded form, while constraining another metric of interest.
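For illustration, a minimal sketch of the kind of thresholded, constrained problem in question (not the paper's algorithm): choose a decision threshold that maximizes recall subject to a precision constraint, both of which are non-decomposable over examples. The `best_threshold` helper and the synthetic data are hypothetical.

```python
import numpy as np

def best_threshold(scores, labels, min_precision=0.8):
    """Scan candidate thresholds; keep the one with the highest recall
    among those whose precision meets the constraint."""
    best_t, best_recall = None, -1.0
    for t in np.unique(scores):
        preds = scores >= t
        tp = np.sum(preds & (labels == 1))
        fp = np.sum(preds & (labels == 0))
        fn = np.sum(~preds & (labels == 1))
        precision = tp / max(tp + fp, 1)
        recall = tp / max(tp + fn, 1)
        if precision >= min_precision and recall > best_recall:
            best_t, best_recall = t, recall
    return best_t, best_recall

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=1000)
scores = labels * 0.4 + rng.normal(0.3, 0.25, size=1000)  # noisy scores
print(best_threshold(scores, labels))
```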
no code implementations • ICLR 2022 • Heinrich Jiang, Harikrishna Narasimhan, Dara Bahri, Andrew Cotter, Afshin Rostamizadeh
In real-world systems, models are frequently updated as more data becomes available, and in addition to achieving high accuracy, the goal is to also maintain a low difference in predictions compared to the base model (i.e., predictive "churn").
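As a concrete (hedged) illustration, predictive churn is commonly measured as the fraction of examples on which the updated model's prediction disagrees with the base model's; the `churn` helper below is hypothetical.

```python
import numpy as np

def churn(base_preds, new_preds):
    """Fraction of examples where the updated model disagrees with the base model."""
    base_preds = np.asarray(base_preds)
    new_preds = np.asarray(new_preds)
    return float(np.mean(base_preds != new_preds))

# Example: two models agree on 9 of 10 predictions -> churn of 0.1.
print(churn([1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
            [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]))
```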
no code implementations • 13 Feb 2021 • Andrew Cotter, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sashank J. Reddi, Yichen Zhou
Distillation is the technique of training a "student" model based on examples that are labeled by a separate "teacher" model, which itself is trained on a labeled dataset.
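A minimal sketch of the standard soft-label distillation objective, assuming a temperature-scaled softmax teacher; this illustrates the setup rather than the paper's specific analysis.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened probabilities."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_log_probs = np.log(softmax(student_logits, temperature) + 1e-12)
    return float(-np.mean(np.sum(teacher_probs * student_log_probs, axis=1)))

rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(8, 3))
student_logits = rng.normal(size=(8, 3))
print(distillation_loss(student_logits, teacher_logits))
```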
no code implementations • NeurIPS 2020 • Harikrishna Narasimhan, Andrew Cotter, Yichen Zhou, Serena Wang, Wenshuo Guo
In machine learning applications such as ranking fairness or fairness over intersectional groups, one often encounters optimization problems with an extremely large number of constraints.
1 code implementation • NeurIPS 2020 • Serena Wang, Wenshuo Guo, Harikrishna Narasimhan, Andrew Cotter, Maya Gupta, Michael I. Jordan
Second, we introduce two new approaches using robust optimization that, unlike the naive approach of only relying on $\hat{G}$, are guaranteed to satisfy fairness criteria on the true protected groups $G$ while minimizing a training objective.
2 code implementations • NeurIPS 2019 • Harikrishna Narasimhan, Andrew Cotter, Maya Gupta
We present a general framework for solving a large class of learning problems with non-linear functions of classification rates.
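To make "non-linear functions of classification rates" concrete, a small hedged sketch: metrics such as the F-measure and G-mean written as functions of the true-positive rate, false-positive rate, and class prior (the helper names are hypothetical).

```python
import numpy as np

def rates(preds, labels):
    """Empirical classification rates from binary predictions."""
    preds, labels = np.asarray(preds), np.asarray(labels)
    tpr = np.mean(preds[labels == 1] == 1)   # true-positive rate (recall)
    fpr = np.mean(preds[labels == 0] == 1)   # false-positive rate
    pos = np.mean(labels == 1)               # class prior
    return tpr, fpr, pos

def f_measure(tpr, fpr, pos):
    """F1 as a non-linear (ratio-of-linear) function of the rates."""
    return 2 * pos * tpr / (pos * (1 + tpr) + (1 - pos) * fpr)

def g_mean(tpr, fpr):
    """Geometric mean of sensitivity and specificity."""
    return np.sqrt(tpr * (1 - fpr))
```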
1 code implementation • NeurIPS 2019 • Andrew Cotter, Maya Gupta, Harikrishna Narasimhan
Stochastic classifiers arise in a number of machine learning problems, and have become especially prominent of late, as they often result from constrained optimization problems, e.g., for fairness, churn, or custom losses.
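A minimal sketch of what such a stochastic classifier looks like, assuming it is represented as a fixed mixture over deterministic classifiers; the class below is hypothetical, not an API from the paper's code.

```python
import numpy as np

class StochasticClassifier:
    """A distribution over deterministic classifiers: each prediction is made by
    a classifier sampled according to fixed mixture weights."""

    def __init__(self, classifiers, weights, seed=0):
        self.classifiers = classifiers
        self.weights = np.asarray(weights) / np.sum(weights)
        self.rng = np.random.default_rng(seed)

    def predict(self, x):
        idx = self.rng.choice(len(self.classifiers), p=self.weights)
        return self.classifiers[idx](x)

    def expected_positive_rate(self, xs):
        # Expected rates are mixture-weighted averages over the member classifiers.
        return sum(w * np.mean([clf(x) for x in xs])
                   for clf, w in zip(self.classifiers, self.weights))

# Two threshold classifiers mixed 70/30.
clf = StochasticClassifier([lambda x: int(x > 0.3), lambda x: int(x > 0.7)], [0.7, 0.3])
print(clf.predict(0.5), clf.expected_positive_rate([0.1, 0.5, 0.9]))
```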
no code implementations • 6 Sep 2019 • Harikrishna Narasimhan, Andrew Cotter, Maya Gupta
We present a general framework for solving a large class of learning problems with non-linear functions of classification rates.
1 code implementation • 12 Jun 2019 • Harikrishna Narasimhan, Andrew Cotter, Maya Gupta, Serena Wang
We present pairwise fairness metrics for ranking models and regression models that form analogues of statistical fairness notions such as equal opportunity, equal accuracy, and statistical parity.
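One hedged illustration of a pairwise fairness metric: the rate at which a positive example from a given group is scored above negative examples, a pairwise analogue of equal opportunity. The helper below is hypothetical and may differ from the paper's exact definitions.

```python
import numpy as np

def group_pairwise_accuracy(scores, labels, groups, group_id):
    """Fraction of (positive-from-group, negative) pairs the model orders correctly."""
    scores, labels, groups = map(np.asarray, (scores, labels, groups))
    pos = (labels == 1) & (groups == group_id)
    neg = labels == 0
    if not pos.any() or not neg.any():
        return float("nan")
    # Compare every positive example in the group against every negative example.
    diffs = scores[pos][:, None] - scores[neg][None, :]
    return float(np.mean(diffs > 0))

scores = [0.9, 0.2, 0.3, 0.4, 0.8, 0.1]
labels = [1,   0,   1,   0,   1,   0]
groups = [0,   0,   1,   1,   0,   1]
print(group_pairwise_accuracy(scores, labels, groups, 0),
      group_pairwise_accuracy(scores, labels, groups, 1))
```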
no code implementations • NeurIPS 2018 • Maya Gupta, Dara Bahri, Andrew Cotter, Kevin Canini
We investigate machine learning models that can provide diminishing returns and accelerating returns guarantees to capture prior knowledge or policies about how outputs should depend on inputs.
1 code implementation • 11 Sep 2018 • Andrew Cotter, Heinrich Jiang, Serena Wang, Taman Narayan, Maya Gupta, Seungil You, Karthik Sridharan
This new formulation leads to an algorithm that produces a stochastic classifier by playing a two-player non-zero-sum game, solving for what we call a semi-coarse correlated equilibrium, which in turn corresponds to an approximately optimal and feasible solution to the constrained optimization problem.
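A toy, hedged sketch of the game dynamics on a simple problem (not the paper's proxy-Lagrangian algorithm): the model player descends a Lagrangian, the constraint player ascends on a multiplier, and the iterates are kept as a uniform stochastic classifier. The coverage constraint and surrogate below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, lam = np.zeros(2), 0.0
target_coverage = 0.4            # constraint: predict positive on <= 40% of examples
iterates = []
for _ in range(500):
    p = sigmoid(X @ w)
    grad_loss = X.T @ (p - y) / len(y)              # logistic-loss gradient
    grad_cons = X.T @ (p * (1 - p)) / len(y)        # gradient of mean(sigmoid) coverage surrogate
    w -= 0.5 * (grad_loss + lam * grad_cons)        # model player: descend the Lagrangian
    violation = np.mean(p) - target_coverage
    lam = max(0.0, lam + 0.5 * violation)           # constraint player: ascend on the multiplier
    iterates.append(w.copy())

# The resulting stochastic classifier predicts with a uniformly sampled iterate.
w_sample = iterates[rng.integers(len(iterates))]
print(np.mean(sigmoid(X @ w_sample) > 0.5))
```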
no code implementations • ICML 2018 • Andrew Cotter, Mahdi Milani Fard, Seungil You, Maya Gupta, Jeff Bilmes
We introduce the problem of grouping a finite ground set into blocks where each block is a subset of the ground set and where: (i) the blocks are individually highly valued by a submodular function (both robustly and in the average case) while satisfying block-specific matroid constraints; and (ii) block scores interact where blocks are jointly scored highly, thus making the blocks mutually non-redundant.
1 code implementation • 29 Jun 2018 • Andrew Cotter, Maya Gupta, Heinrich Jiang, Nathan Srebro, Karthik Sridharan, Serena Wang, Blake Woodworth, Seungil You
Classifiers can be trained with data-dependent constraints to satisfy fairness goals, reduce churn, achieve a targeted false positive rate, or other policy goals.
no code implementations • 28 Jun 2018 • Maya Gupta, Andrew Cotter, Mahdi Milani Fard, Serena Wang
We consider the problem of improving fairness when one lacks access to a dataset labeled with protected groups, making it difficult to take advantage of strategies that can improve fairness but require protected group labels, either at training or runtime.
no code implementations • 31 May 2018 • Andrew Cotter, Maya Gupta, Heinrich Jiang, James Muller, Taman Narayan, Serena Wang, Tao Zhu
We propose learning flexible but interpretable functions that aggregate a variable-length set of permutation-invariant feature vectors to predict a label.
1 code implementation • 17 Apr 2018 • Andrew Cotter, Heinrich Jiang, Karthik Sridharan
For both the proxy-Lagrangian and Lagrangian formulations, however, we prove that this classifier, instead of having unbounded size, can be taken to be a distribution over no more than m+1 models (where m is the number of constraints).
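A hedged sketch of the intuition behind the m+1 bound: a linear program over mixtures of T candidate models has only m inequality constraints plus one simplex constraint, so a vertex (basic) solution is supported on at most m+1 models. The per-model objective and violation values below are synthetic, and the LP is only an illustration of the sparsification idea.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
T, m = 50, 3
objective = rng.uniform(0.1, 1.0, size=T)        # per-model objective values
violations = rng.normal(0.0, 0.2, size=(m, T))   # per-model constraint violations

result = linprog(
    c=objective,                        # minimize the expected objective
    A_ub=violations, b_ub=np.zeros(m),  # expected violations <= 0
    A_eq=np.ones((1, T)), b_eq=[1.0],   # mixture weights sum to one
    bounds=(0, None),
    method="highs-ds",                  # simplex, so the solution is a vertex
)
if result.success:
    support = np.flatnonzero(result.x > 1e-9)
    print(f"{len(support)} models in the mixture (at most m + 1 = {m + 1})")
```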
no code implementations • NeurIPS 2016 • Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta
For many machine learning problems, there are some inputs that are known to be positively (or negatively) related to the output, and in such cases training the model to respect that monotonic relationship can provide regularization and make the model more interpretable.
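As one hedged illustration of exploiting a known monotonic relationship (a pairwise monotonicity penalty added to a training loss, not the lattice-based construction studied in the paper):

```python
import numpy as np

def monotonicity_penalty(model_fn, xs, feature_idx, delta=0.1):
    """Penalize decreases in the prediction when a feature known to be
    positively related to the output is increased by `delta`."""
    xs_plus = xs.copy()
    xs_plus[:, feature_idx] += delta
    decrease = model_fn(xs) - model_fn(xs_plus)     # positive where monotonicity is violated
    return np.mean(np.maximum(decrease, 0.0) ** 2)

# Example: a model that is (wrongly) decreasing in feature 0 gets a nonzero penalty.
model_fn = lambda x: -2.0 * x[:, 0] + x[:, 1]
xs = np.random.default_rng(0).normal(size=(100, 2))
print(monotonicity_penalty(model_fn, xs, feature_idx=0))
```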
no code implementations • NeurIPS 2016 • Gabriel Goh, Andrew Cotter, Maya Gupta, Michael Friedlander
The goal of minimizing misclassification error on a training set is often just one of several real-world goals that might be defined on different datasets.
no code implementations • 15 Dec 2015 • Andrew Cotter, Maya Gupta, Jan Pfeifer
Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, classifiers that guarantee a certain class label for some subset of examples, etc.
no code implementations • 23 May 2015 • Maya Gupta, Andrew Cotter, Jan Pfeifer, Konstantin Voevodski, Kevin Canini, Alexander Mangylov, Wojtek Moczydlowski, Alex van Esbroeck
Real-world machine learning applications may require functions that are fast-to-evaluate and interpretable.
no code implementations • 15 Aug 2013 • Andrew Cotter
In Part II, we consider the unsupervised problem of Principal Component Analysis, for which the learning task is to find the directions that contain most of the variance of the data distribution.
no code implementations • NeurIPS 2013 • Raman Arora, Andrew Cotter, Nathan Srebro
We study PCA as a stochastic optimization problem and propose a novel stochastic approximation algorithm which we refer to as "Matrix Stochastic Gradient" (MSG), as well as a practical variant, Capped MSG.
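A hedged sketch of an MSG-style update on a toy stream, assuming the feasible set {0 ⪯ P ⪯ I, tr(P) = k} and a spectral projection found by bisection; step-size choices and the Capped MSG rank control are omitted, so this is an illustration of the update rather than the paper's full algorithm.

```python
import numpy as np

def project_spectrum(eigvals, k):
    """Find s such that clip(eigvals + s, 0, 1) sums to k (Euclidean projection)."""
    lo, hi = -eigvals.max(), k + 1.0 - eigvals.min()
    for _ in range(60):
        s = 0.5 * (lo + hi)
        if np.clip(eigvals + s, 0.0, 1.0).sum() > k:
            hi = s
        else:
            lo = s
    return np.clip(eigvals + s, 0.0, 1.0)

def msg_pca(samples, k, eta=0.1):
    d = samples.shape[1]
    P = np.zeros((d, d))
    for x in samples:
        P += eta * np.outer(x, x)                  # rank-one stochastic gradient step
        eigvals, eigvecs = np.linalg.eigh(P)
        eigvals = project_spectrum(eigvals, k)     # project back onto the feasible set
        P = (eigvecs * eigvals) @ eigvecs.T
    return P

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 5)) @ np.diag([3.0, 2.0, 1.0, 0.2, 0.1])
P = msg_pca(samples, k=2)
print(np.round(np.linalg.eigvalsh(P)[::-1], 2))    # top eigenvalues should be near 1
```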
no code implementations • NeurIPS 2011 • Andrew Cotter, Ohad Shamir, Nati Srebro, Karthik Sridharan
Mini-batch algorithms have recently received significant attention as a way to speed up stochastic convex optimization problems.
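A minimal sketch of mini-batch SGD on a convex (logistic) objective, illustrating the trade-off studied: larger batches reduce per-step gradient variance and allow parallel gradient computation, at the cost of touching more data per update. All names and data below are hypothetical.

```python
import numpy as np

def minibatch_sgd(X, y, batch_size, steps=300, lr=0.5, seed=0):
    """Plain mini-batch SGD on a logistic-regression objective."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        idx = rng.choice(len(y), size=batch_size, replace=False)
        p = 1.0 / (1.0 + np.exp(-X[idx] @ w))
        w -= lr * X[idx].T @ (p - y[idx]) / batch_size   # averaged gradient over the batch
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0, 0.0]) + 0.1 * rng.normal(size=2000) > 0).astype(float)

for b in (1, 32, 256):   # larger batches: lower variance per step, more data per step
    w = minibatch_sgd(X, y, batch_size=b)
    print(b, np.round(w, 2))
```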