Search Results for author: Yuekai Sun

Found 46 papers, 16 papers with code

Learning In Reverse Causal Strategic Environments With Ramifications on Two Sided Markets

no code implementations 20 Apr 2024 Seamus Somerstep, Yuekai Sun, Ya'acov Ritov

Motivated by equilibrium models of labor markets, we develop a formulation of causal strategic classification in which strategic agents can directly manipulate their outcomes.

Aligners: Decoupling LLMs and Alignment

no code implementations 7 Mar 2024 Lilian Ngweta, Mayank Agarwal, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications.

tinyBenchmarks: evaluating LLMs with fewer examples

2 code implementations 22 Feb 2024 Felipe Maia Polo, Lucas Weber, Leshem Choshen, Yuekai Sun, Gongjun Xu, Mikhail Yurochkin

The versatility of large language models (LLMs) led to the creation of diverse benchmarks that thoroughly test a variety of language models' abilities.

Multiple-choice

Estimating Fréchet bounds for validating programmatic weak supervision

no code implementations 7 Dec 2023 Felipe Maia Polo, Mikhail Yurochkin, Moulinath Banerjee, Subha Maity, Yuekai Sun

We develop methods for estimating Fréchet bounds on (possibly high-dimensional) distribution classes in which some variables are continuous-valued.

Fusing Models with Complementary Expertise

no code implementations 2 Oct 2023 Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin

Training AI models that generalize across tasks and domains has long been among the open problems driving AI research.

Multiple-choice text-classification

An Investigation of Representation and Allocation Harms in Contrastive Learning

1 code implementation 2 Oct 2023 Subha Maity, Mayank Agarwal, Mikhail Yurochkin, Yuekai Sun

In this paper, we demonstrate that contrastive learning (CL), a popular variant of SSL, tends to collapse representations of minority groups with certain majority groups.

Contrastive Learning Self-Supervised Learning

Conditional independence testing under misspecified inductive biases

1 code implementation NeurIPS 2023 Felipe Maia Polo, Yuekai Sun, Moulinath Banerjee

Conditional independence (CI) testing is a fundamental and challenging task in modern statistics and machine learning.

regression

ISAAC Newton: Input-based Approximate Curvature for Newton's Method

1 code implementation 1 May 2023 Felix Petersen, Tobias Sutter, Christian Borgelt, Dongsung Huh, Hilde Kuehne, Yuekai Sun, Oliver Deussen

We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons.

Second-order methods
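
As a rough sketch of the input-based idea: precondition a linear layer's gradient with a damped Gram matrix of that layer's inputs. This is a toy rendering only; the function name, the damping `lam`, and the dense solve are illustrative, and the paper's method instead exploits the low-rank structure when the batch is smaller than the layer width.

```python
import numpy as np

def precondition_grad(grad_W, X_batch, lam=1e-2):
    """Toy input-based preconditioning for a linear layer with weights W.

    grad_W : gradient of the loss w.r.t. W, shape (out_dim, in_dim)
    X_batch: inputs to the layer for the current batch, shape (batch, in_dim)
    lam    : damping added to the input Gram matrix
    """
    b, d = X_batch.shape
    gram = X_batch.T @ X_batch / b + lam * np.eye(d)  # (in_dim, in_dim)
    # Naive dense solve; a practical implementation would exploit the
    # low-rank structure when the batch size is smaller than the layer width.
    return np.linalg.solve(gram, grad_W.T).T

# Condition a random gradient of a 64 -> 10 layer using a batch of 32 inputs.
rng = np.random.default_rng(0)
G = rng.normal(size=(10, 64))
X = rng.normal(size=(32, 64))
print(precondition_grad(G, X).shape)  # (10, 64)
```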

Simple Disentanglement of Style and Content in Visual Representations

1 code implementation 20 Feb 2023 Lilian Ngweta, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

Learning visual representations with interpretable features, i.e., disentangled representations, remains a challenging problem.

Disentanglement Domain Generalization

How does overparametrization affect performance on minority groups?

1 code implementation 7 Jun 2022 Subha Maity, Saptarshi Roy, Songkai Xue, Mikhail Yurochkin, Yuekai Sun

The benefits of overparameterization for the overall performance of modern machine learning (ML) models are well known.

regression

Predictor-corrector algorithms for stochastic optimization under gradual distribution shift

1 code implementation 26 May 2022 Subha Maity, Debarghya Mukherjee, Moulinath Banerjee, Yuekai Sun

Time-varying stochastic optimization problems frequently arise in machine learning practice (e.g., gradual domain shift, object tracking, strategic classification).

Object Tracking Stochastic Optimization
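
A generic predictor-corrector update for a slowly drifting minimizer looks like the sketch below; it is schematic (linear extrapolation, fixed step sizes) and not necessarily the exact algorithm analyzed in the paper.

```python
import numpy as np

def predictor_corrector_step(x_prev, x_curr, grad_next, step=0.1, extrap=1.0):
    """One generic predictor-corrector update for a drifting objective.

    Predictor: extrapolate along the recent trajectory to anticipate the drift.
    Corrector: take a gradient step on the newly revealed objective.
    """
    x_pred = x_curr + extrap * (x_curr - x_prev)   # predictor (extrapolation)
    return x_pred - step * grad_next(x_pred)       # corrector (gradient step)

# Toy drifting quadratic: the minimizer moves along the first coordinate.
def make_grad(t):
    target = np.array([0.1 * t, 0.0])
    return lambda x: 2.0 * (x - target)

x_prev, x_curr = np.zeros(2), np.zeros(2)
for t in range(1, 20):
    x_next = predictor_corrector_step(x_prev, x_curr, make_grad(t))
    x_prev, x_curr = x_curr, x_next
print(np.round(x_curr, 2))  # close to the final target (1.9, 0.0)
```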

Understanding new tasks through the lens of training data via exponential tilting

1 code implementation 26 May 2022 Subha Maity, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun

It is conceivable that the training data can be reweighted to be more representative of the new (target) task.

Model Selection
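
As a hedged illustration of reweighting by exponential tilting: weights of the form w_i ∝ exp(θᵀ f(x_i)) are fit so that the tilted source features match the target feature mean. The moment-matching objective below is only illustrative; the paper's estimator may differ.

```python
import numpy as np
from scipy.optimize import minimize

def tilt_weights(source_feats, target_feats):
    """Exponentially tilt source examples so that their weighted feature mean
    matches the target feature mean: w_i proportional to exp(theta^T f(x_i))."""
    mu_target = target_feats.mean(axis=0)

    def objective(theta):
        # Gradient vanishes when the tilted source mean equals the target mean.
        logits = source_feats @ theta
        return np.log(np.mean(np.exp(logits))) - mu_target @ theta

    theta = minimize(objective, np.zeros(source_feats.shape[1])).x
    w = np.exp(source_feats @ theta)
    return w / w.sum()

# Toy check: source features ~ N(0, 1), target features ~ N(1, 1).
rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(5000, 1))
tgt = rng.normal(1.0, 1.0, size=(5000, 1))
w = tilt_weights(src, tgt)
print((w[:, None] * src).sum(axis=0))  # reweighted source mean is close to 1.0
```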

Domain Adaptation meets Individual Fairness. And they get along

no code implementations 1 May 2022 Debarghya Mukherjee, Felix Petersen, Mikhail Yurochkin, Yuekai Sun

In this paper, we leverage this connection between algorithmic fairness and distribution shifts to show that algorithmic fairness interventions can help ML models overcome distribution shifts, and that domain adaptation methods (for overcoming distribution shifts) can mitigate algorithmic biases.

Domain Adaptation Fairness

Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms

no code implementations 13 Apr 2022 Laura Niss, Yuekai Sun, Ambuj Tewari

Sampling biases in training data are a major source of algorithmic biases in machine learning systems.

Individually Fair Gradient Boosting

no code implementations ICLR 2021 Alexander Vargo, Fan Zhang, Mikhail Yurochkin, Yuekai Sun

Gradient boosting is a popular method for machine learning from tabular data, which arise often in applications where algorithmic fairness is a concern.

Fairness

Statistical inference for individual fairness

1 code implementation ICLR 2021 Subha Maity, Songkai Xue, Mikhail Yurochkin, Yuekai Sun

As we rely on machine learning (ML) models to make more consequential decisions, the issue of ML models perpetuating or even exacerbating undesirable historical biases (e.g., gender and racial biases) has come to the fore of the public's attention.

Adversarial Attack Fairness

Individually Fair Ranking

no code implementations 19 Mar 2021 Amanda Bower, Hamid Eftekhari, Mikhail Yurochkin, Yuekai Sun

We develop an algorithm to train individually fair learning-to-rank (LTR) models.

Fairness Learning-To-Rank

Outlier Robust Optimal Transport

no code implementations 1 Jan 2021 Debarghya Mukherjee, Aritra Guha, Justin Solomon, Yuekai Sun, Mikhail Yurochkin

In light of recent advances in solving the optimal transport (OT) problem, OT distances are widely used as loss functions in minimum distance estimation.

Outlier Detection

There is no trade-off: enforcing fairness can improve accuracy

no code implementations 28 Sep 2020 Subha Maity, Debarghya Mukherjee, Mikhail Yurochkin, Yuekai Sun

If the algorithmic biases in an ML model are due to sampling biases in the training data, then enforcing algorithmic fairness may improve the performance of the ML model on unbiased test data.

Fairness

Two Simple Ways to Learn Individual Fairness Metrics from Data

no code implementations 19 Jun 2020 Debarghya Mukherjee, Mikhail Yurochkin, Moulinath Banerjee, Yuekai Sun

Individual fairness is an intuitive definition of algorithmic fairness that addresses some of the drawbacks of group fairness.

Fairness

Minimax optimal approaches to the label shift problem in non-parametric settings

no code implementations 23 Mar 2020 Subha Maity, Yuekai Sun, Moulinath Banerjee

We study the minimax rates of the label shift problem in non-parametric classification.


Auditing ML Models for Individual Bias and Unfairness

no code implementations 11 Mar 2020 Songkai Xue, Mikhail Yurochkin, Yuekai Sun

We consider the task of auditing ML models for individual bias/unfairness.

Federated Learning with Matched Averaging

1 code implementation ICLR 2020 Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris Papailiopoulos, Yasaman Khazaeni

Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud.

Federated Learning

Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions

1 code implementation 26 Dec 2019 Subha Maity, Yuekai Sun, Moulinath Banerjee

We consider the task of meta-analysis in high-dimensional settings in which the data sources are similar but non-identical.

regression

Training individually fair ML models with Sensitive Subspace Robustness

2 code implementations ICLR 2020 Mikhail Yurochkin, Amanda Bower, Yuekai Sun

We consider training machine learning models that are fair in the sense that their performance is invariant under certain sensitive perturbations to the inputs.

Fairness
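
A toy sketch of the invariance idea for a linear model: adversarially shift each input along a given sensitive direction before each gradient step. This is not the paper's distributionally robust training procedure; the function name, `eps`, and the one-dimensional sensitive subspace are illustrative only.

```python
import numpy as np

def train_sensitive_robust_logreg(X, y, v_sensitive, eps=1.0, lr=0.1, epochs=200):
    """Logistic regression trained on worst-case shifts of each input along a
    single sensitive direction v (a toy 'sensitive subspace')."""
    v = v_sensitive / np.linalg.norm(v_sensitive)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Worst-case shift of size eps along +/- v for each example's logistic loss.
        shift = -eps * (2 * y - 1) * np.sign(w @ v)
        X_adv = X + shift[:, None] * v
        p = 1.0 / (1.0 + np.exp(-(X_adv @ w)))
        w -= lr * X_adv.T @ (p - y) / len(y)   # gradient step on perturbed data
    return w

# Toy data: the label depends on the second feature; the first is "sensitive".
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 1] + 0.1 * rng.normal(size=500) > 0).astype(float)
w = train_sensitive_robust_logreg(X, y, v_sensitive=np.array([1.0, 0.0, 0.0]))
print(np.round(w, 2))  # the weight on the sensitive coordinate stays near zero
```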

Dirichlet Simplex Nest and Geometric Inference

1 code implementation 27 May 2019 Mikhail Yurochkin, Aritra Guha, Yuekai Sun, XuanLong Nguyen

We propose Dirichlet Simplex Nest, a class of probabilistic models suitable for a variety of data types, and develop fast and provably accurate inference algorithms by accounting for the model's convex geometry and low dimensional simplicial structure.

Precision Matrix Estimation with Noisy and Missing Data

no code implementations 7 Apr 2019 Roger Fan, Byoungwook Jang, Yuekai Sun, Shuheng Zhou

Estimating conditional dependence graphs and precision matrices is among the most common problems in modern statistics and machine learning.

An inexact subsampled proximal Newton-type method for large-scale machine learning

no code implementations 28 Aug 2017 Xuanqing Liu, Cho-Jui Hsieh, Jason D. Lee, Yuekai Sun

We propose a fast proximal Newton-type algorithm for minimizing regularized finite sums that returns an $\epsilon$-suboptimal point in $\tilde{\mathcal{O}}(d(n + \sqrt{\kappa d})\log(\frac{1}{\epsilon}))$ FLOPS, where $n$ is the number of samples, $d$ is the feature dimension, and $\kappa$ is the condition number.

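The subsampled curvature at the core of such methods can be illustrated in a few lines, e.g. for ℓ2-regularized logistic regression; the sampling scheme and damping below are illustrative, not the paper's exact construction.

```python
import numpy as np

def subsampled_hessian(X, w, batch, reg=1e-3):
    """Hessian of l2-regularized logistic loss estimated from a subsample.

    X     : n x d design matrix
    w     : current iterate, length d
    batch : indices of the subsample S
    The full Hessian costs O(n d^2); the subsample costs only O(|S| d^2).
    """
    Xs = X[batch]
    p = 1.0 / (1.0 + np.exp(-(Xs @ w)))
    D = p * (1.0 - p)                          # per-example curvature
    return Xs.T @ (D[:, None] * Xs) / len(batch) + reg * np.eye(X.shape[1])

rng = np.random.default_rng(0)
X, w = rng.normal(size=(10000, 20)), rng.normal(size=20)
S = rng.choice(10000, size=256, replace=False)
print(subsampled_hessian(X, w, S).shape)  # (20, 20)
```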

Evaluating the statistical significance of biclusters

no code implementations NeurIPS 2015 Jason D. Lee, Yuekai Sun, Jonathan E. Taylor

Biclustering (also known as submatrix localization) is a problem of high practical relevance in exploratory analysis of high-dimensional data.

Communication-efficient sparse regression: a one-shot approach

no code implementations 14 Mar 2015 Jason D. Lee, Yuekai Sun, Qiang Liu, Jonathan E. Taylor

We devise a one-shot approach to distributed sparse regression in the high-dimensional setting.

regression
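
The one-shot pattern (fit locally, communicate once, aggregate centrally) is easy to sketch. The version below averages plain lasso fits and hard-thresholds at the center; it omits the debiasing of the local estimates that the one-shot approach relies on, so treat it purely as scaffolding.

```python
import numpy as np
from sklearn.linear_model import Lasso

def one_shot_sparse_regression(shards, alpha=0.1, threshold=0.05):
    """Schematic one-shot distributed estimator: local lasso fits are averaged
    and hard-thresholded at the center (debiasing step omitted for brevity)."""
    local = [Lasso(alpha=alpha).fit(X, y).coef_ for X, y in shards]
    avg = np.mean(local, axis=0)                # one round of communication
    return np.where(np.abs(avg) > threshold, avg, 0.0)

# Toy example: 5 machines, sparse ground truth beta = (2, -1, 0, ..., 0).
rng = np.random.default_rng(0)
beta = np.zeros(20)
beta[0], beta[1] = 2.0, -1.0
shards = []
for _ in range(5):
    X = rng.normal(size=(200, 20))
    shards.append((X, X @ beta + 0.5 * rng.normal(size=200)))
print(np.round(one_shot_sparse_regression(shards)[:4], 2))  # roughly (2, -1, 0, 0)
```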

On model selection consistency of penalized M-estimators: a geometric theory

no code implementations NeurIPS 2013 Jason D. Lee, Yuekai Sun, Jonathan E. Taylor

Penalized M-estimators are used in diverse areas of science and engineering to fit high-dimensional models with some low-dimensional structure.

Model Selection

Learning Mixtures of Linear Classifiers

no code implementations 11 Nov 2013 Yuekai Sun, Stratis Ioannidis, Andrea Montanari

We consider a discriminative learning (regression) problem, whereby the regression function is a convex combination of k linear classifiers.

regression
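
A minimal generative sketch of such a mixture, assuming binary labels and a latent per-example choice of classifier; the paper's precise model may differ in details.

```python
import numpy as np

def sample_mixture_of_linear_classifiers(n, weights, classifiers, rng):
    """Draw (x, y) where each example picks one of k linear classifiers at
    random (with the given mixture weights) and labels x by its sign."""
    d = classifiers.shape[1]
    X = rng.normal(size=(n, d))
    z = rng.choice(len(weights), size=n, p=weights)        # latent component
    y = (np.einsum('ij,ij->i', X, classifiers[z]) > 0).astype(int)
    return X, y, z

rng = np.random.default_rng(0)
W = np.array([[1.0, 0.0], [0.0, 1.0]])                     # k = 2 classifiers in R^2
X, y, z = sample_mixture_of_linear_classifiers(1000, [0.6, 0.4], W, rng)
print(y.mean())  # about 0.5 by symmetry of the Gaussian inputs
```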

On model selection consistency of regularized M-estimators

no code implementations 31 May 2013 Jason D. Lee, Yuekai Sun, Jonathan E. Taylor

Regularized M-estimators are used in diverse areas of science and engineering to fit high-dimensional models with some low-dimensional structure.

Model Selection

Proximal Newton-type methods for minimizing composite functions

1 code implementation 7 Jun 2012 Jason D. Lee, Yuekai Sun, Michael A. Saunders

We generalize Newton-type methods for minimizing smooth functions to handle a sum of two convex functions: a smooth function and a nonsmooth function with a simple proximal mapping.

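A bare-bones sketch of one such step for a smooth g plus h(x) = λ‖x‖₁: build the local quadratic model of g and approximately minimize the model plus h with a few proximal-gradient inner iterations. The function names and the fixed inner step size are illustrative, not the paper's exact algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_newton_step(x, grad_g, hess_g, lam, inner_iters=50):
    """One proximal Newton step for min_x g(x) + lam * ||x||_1.

    Approximately minimizes the local model
        g(x) + grad^T (z - x) + 0.5 (z - x)^T H (z - x) + lam * ||z||_1
    over z with proximal-gradient inner iterations (an inexact subproblem solve).
    """
    g, H = grad_g(x), hess_g(x)
    L = np.linalg.eigvalsh(H).max()            # step size from the model's curvature
    z = x.copy()
    for _ in range(inner_iters):
        model_grad = g + H @ (z - x)
        z = soft_threshold(z - model_grad / L, lam / L)
    return z

# Toy composite problem: g(x) = 0.5 * ||A x - b||^2 plus an l1 penalty.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 10)), rng.normal(size=100)
grad_g = lambda x: A.T @ (A @ x - b)
hess_g = lambda x: A.T @ A
x = np.zeros(10)
for _ in range(5):
    x = prox_newton_step(x, grad_g, hess_g, lam=1.0)
print(np.round(x, 2))  # sparse minimizer of the l1-regularized least squares
```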
