Search Results for author: Edgar Dobriban

Found 55 papers, 34 papers with code

Inference in Randomized Least Squares and PCA via Normality of Quadratic Forms

1 code implementation 1 Apr 2024 Leda Wang, Zhixiang Zhang, Edgar Dobriban

To our knowledge, no comparable methods are available for SSE and for SRHT in PCA.

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

1 code implementation 28 Mar 2024 Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Hamed Hassani, Eric Wong

To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (2) a jailbreaking dataset comprising 100 behaviors -- both original and sourced from prior work -- which align with OpenAI's usage policies; (3) a standardized evaluation framework that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard that tracks the performance of attacks and defenses for various LLMs.

Minimax Optimal Fair Classification with Bounded Demographic Disparity

1 code implementation 27 Mar 2024 Xianli Zeng, Guang Cheng, Edgar Dobriban

Mitigating the disparate impact of statistical machine learning methods is crucial for ensuring fairness.

Binary Classification Classification +1

Bayes-Optimal Fair Classification with Linear Disparity Constraints via Pre-, In-, and Post-processing

1 code implementation 5 Feb 2024 Xianli Zeng, Guang Cheng, Edgar Dobriban

To address this, we develop methods for Bayes-optimal fair classification, aiming to minimize classification error subject to given group fairness constraints.

Attribute Classification +2

SymmPI: Predictive Inference for Data with Group Symmetries

1 code implementation 26 Dec 2023 Edgar Dobriban, Mengxin Yu

Methods for predictive inference have been developed under a variety of assumptions, often -- for instance, in standard conformal prediction -- relying on the invariance of the distribution of the data under special groups of transformations such as permutation groups.

Conformal Prediction

PAC Prediction Sets Under Label Shift

1 code implementation 19 Oct 2023 Wenwen Si, Sangdon Park, Insup Lee, Edgar Dobriban, Osbert Bastani

We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.

Uncertainty Quantification

Jailbreaking Black Box Large Language Models in Twenty Queries

1 code implementation 12 Oct 2023 Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong

PAIR -- which is inspired by social engineering attacks -- uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM without human intervention.

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

no code implementations 11 Oct 2023 Behrad Moniri, Donghwan Lee, Hamed Hassani, Edgar Dobriban

However, with a constant gradient descent step size, this spike only carries information from the linear component of the target function and therefore learning non-linear components is impossible.

Statistical Estimation Under Distribution Shift: Wasserstein Perturbations and Minimax Theory

2 code implementations 3 Aug 2023 Patrick Chao, Edgar Dobriban

Under a squared loss for mean estimation and prediction error in linear regression, we find the exact minimax risk, a least favorable perturbation, and show that the sample mean and least squares estimators are respectively optimal.

Density Estimation regression

Efficient and Multiply Robust Risk Estimation under General Forms of Dataset Shift

no code implementations 28 Jun 2023 Hongxiang Qiu, Eric Tchetgen Tchetgen, Edgar Dobriban

Despite extensive literature on dataset shift, limited works address how to efficiently use the auxiliary populations to improve the accuracy of risk evaluation for a given machine learning task in the target population.

Domain Adaptation Transfer Learning

Optimal Heterogeneous Collaborative Linear Regression and Contextual Bandits

no code implementations 9 Jun 2023 Xinmeng Huang, Kan Xu, Donghwan Lee, Hamed Hassani, Hamsa Bastani, Edgar Dobriban

MOLAR improves the dependence of the estimation error on the data dimension, compared to independent least squares estimates.

Multi-Armed Bandits regression

Sharp-SSL: Selective high-dimensional axis-aligned random projections for semi-supervised learning

no code implementations 18 Apr 2023 Tengyao Wang, Edgar Dobriban, Milana Gataric, Richard J. Samworth

We propose a new method for high-dimensional semi-supervised learning problems based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data.

Demystifying Disagreement-on-the-Line in High Dimensions

1 code implementation 31 Jan 2023 Donghwan Lee, Behrad Moniri, Xinmeng Huang, Edgar Dobriban, Hamed Hassani

Evaluating the performance of machine learning models under distribution shift is challenging, especially when we only have unlabeled data from the shifted (target) domain, along with labeled data from the original (source) domain.

Conformal Frequency Estimation using Discrete Sketched Data with Coverage for Distinct Queries

1 code implementation 9 Nov 2022 Matteo Sesia, Stefano Favaro, Edgar Dobriban

This paper develops conformal inference methods to construct a confidence interval for the frequency of a queried object in a very large discrete data set, based on a sketch with a lower memory footprint.

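Sketches of the kind this paper builds on (e.g., the count-min sketch) summarize a large stream in a small table while guaranteeing one-sided error: the estimated frequency never falls below the true one. As a minimal illustrative sketch only (not the paper's conformal method), the hash family `rows` below is a hypothetical affine family chosen for simplicity:

```python
from collections import Counter
import numpy as np

rng = np.random.default_rng(5)
stream = rng.integers(0, 1000, size=20_000)  # large discrete data stream

depth, width = 4, 256
table = np.zeros((depth, width), dtype=int)

# Hypothetical simple affine hash family; real implementations use
# pairwise-independent hashes.
a, b = [3, 5, 7, 11], [1, 2, 3, 4]
def rows(x):
    return [(a[r] * int(x) + b[r]) % width for r in range(depth)]

for x in stream:                    # update: increment one cell per row
    for r, c in enumerate(rows(x)):
        table[r, c] += 1

true = Counter(stream.tolist())
query = 42
est = min(table[r, c] for r, c in enumerate(rows(query)))
print(true[query], est)             # est >= true count, by construction
```

Each cell only ever accumulates counts, so the minimum over rows upper-bounds the true frequency; collisions make it an overestimate, which is the bias the conformal intervals in the paper must account for.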
PAC Prediction Sets for Meta-Learning

no code implementations 6 Jul 2022 Sangdon Park, Edgar Dobriban, Insup Lee, Osbert Bastani

Uncertainty quantification is a key component of machine learning models targeted at safety-critical systems such as in healthcare or autonomous vehicles.

Autonomous Vehicles Meta-Learning +1

Pursuit of a Discriminative Representation for Multiple Subspaces via Sequential Games

1 code implementation 18 Jun 2022 Druv Pai, Michael Psenka, Chih-Yuan Chiu, Manxi Wu, Edgar Dobriban, Yi Ma

We consider the problem of learning discriminative representations for data in a high-dimensional space with distribution supported on or around multiple low-dimensional linear subspaces.

Representation Learning

Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces

no code implementations 16 Jun 2022 Yinshuang Xu, Jiahui Lei, Edgar Dobriban, Kostas Daniilidis

We present a unified derivation of kernels via the Fourier domain by leveraging the sparsity of Fourier coefficients of the lifted feature fields.

Point Cloud Classification

Memory Classifiers: Two-stage Classification for Robustness in Machine Learning

no code implementations 10 Jun 2022 Souradeep Dutta, Yahan Yang, Elena Bernardis, Edgar Dobriban, Insup Lee

We propose a new method for classification which can improve robustness to distribution shifts, by combining expert knowledge about the "high-level" structure of the data with standard classifiers.

BIG-bench Machine Learning Classification +3

Collaborative Learning of Discrete Distributions under Heterogeneity and Communication Constraints

no code implementations 1 Jun 2022 Xinmeng Huang, Donghwan Lee, Edgar Dobriban, Hamed Hassani

In modern machine learning, users often have to collaborate to learn the distribution of the data.

PAC-Wrap: Semi-Supervised PAC Anomaly Detection

no code implementations 22 May 2022 Shuo Li, Xiayan Ji, Edgar Dobriban, Oleg Sokolsky, Insup Lee

Anomaly detection is essential for preventing hazardous outcomes for safety-critical applications like autonomous driving.

Autonomous Driving Unsupervised Anomaly Detection

Fair Bayes-Optimal Classifiers Under Predictive Parity

1 code implementation 15 May 2022 Xianli Zeng, Edgar Dobriban, Guang Cheng

This paper considers predictive parity, which requires equalizing the probability of success given a positive prediction among different protected groups.

SE(3)-Equivariant Attention Networks for Shape Reconstruction in Function Space

no code implementations 5 Apr 2022 Evangelos Chatzipantazis, Stefanos Pertigkiozoglou, Edgar Dobriban, Kostas Daniilidis

In contrast to previous shape reconstruction methods that align the input to a regular grid, we operate directly on the irregular point cloud.

3D Shape Reconstruction

Prediction Sets Adaptive to Unknown Covariate Shift

1 code implementation 11 Mar 2022 Hongxiang Qiu, Edgar Dobriban, Eric Tchetgen Tchetgen

Predicting sets of outcomes -- instead of unique outcomes -- is a promising solution to uncertainty quantification in statistical learning.

Uncertainty Quantification

T-Cal: An optimal test for the calibration of predictive models

1 code implementation 3 Mar 2022 Donghwan Lee, Xinmeng Huang, Hamed Hassani, Edgar Dobriban

We find that detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.

Bayes-Optimal Classifiers under Group Fairness

1 code implementation 20 Feb 2022 Xianli Zeng, Edgar Dobriban, Guang Cheng

Machine learning algorithms are becoming integrated into more and more high-stakes decision-making processes, such as in social welfare issues.

BIG-bench Machine Learning Decision Making +1

Learning Augmentation Distributions using Transformed Risk Minimization

no code implementations 16 Nov 2021 Evangelos Chatzipantazis, Stefanos Pertigkiozoglou, Kostas Daniilidis, Edgar Dobriban

We propose a new \emph{Transformed Risk Minimization} (TRM) framework as an extension of classical risk minimization.

Rotated MNIST

Comparing Classes of Estimators: When does Gradient Descent Beat Ridge Regression in Linear Models?

1 code implementation 26 Aug 2021 Dominic Richards, Edgar Dobriban, Patrick Rebeschini

Methods for learning from data depend on various types of tuning parameters, such as penalization strength or step size.

regression

PAC Prediction Sets Under Covariate Shift

1 code implementation ICLR 2022 Sangdon Park, Edgar Dobriban, Insup Lee, Osbert Bastani

Our approach focuses on the setting where there is a covariate shift from the source distribution (where we have labeled training examples) to the target distribution (for which we want to quantify uncertainty).

Uncertainty Quantification

Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition

1 code implementation 17 Mar 2021 Yaodong Yu, Zitong Yang, Edgar Dobriban, Jacob Steinhardt, Yi Ma

To investigate this gap, we decompose the test risk into its bias and variance components and study their behavior as a function of adversarial training perturbation radii ($\varepsilon$).
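The decomposition used here is the classical one: averaging squared test error over independently drawn training sets splits it exactly into a squared-bias term and a variance term. A minimal numpy sketch of that identity (for plain least squares with label noise, not the paper's adversarial-training setting):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test, n_models = 5, 40, 200, 50
w_true = rng.normal(size=d)
X_test = rng.normal(size=(n_test, d))
y_test_clean = X_test @ w_true          # noiseless targets at test points

# Train many models on independent noisy training sets
preds = np.empty((n_models, n_test))
for k in range(n_models):
    X = rng.normal(size=(n_train, d))
    y = X @ w_true + rng.normal(scale=1.0, size=n_train)  # label noise
    w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    preds[k] = X_test @ w_hat

mean_pred = preds.mean(axis=0)
bias2 = np.mean((mean_pred - y_test_clean) ** 2)   # squared bias
variance = np.mean(preds.var(axis=0))              # variance over training sets
risk = np.mean((preds - y_test_clean) ** 2)        # average test risk
print(bias2, variance, risk)  # risk == bias2 + variance (up to float error)
```

The identity holds pointwise at every test input, so it survives averaging; the paper studies how the two terms behave as the adversarial perturbation radius grows.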

Optimal Iterative Sketching Methods with the Subsampled Randomized Hadamard Transform

no code implementations NeurIPS 2020 Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci

These show that the convergence rate for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.

Dimensionality Reduction

Sparse sketches with small inversion bias

no code implementations 21 Nov 2020 Michał Dereziński, Zhenyu Liao, Edgar Dobriban, Michael W. Mahoney

For a tall $n\times d$ matrix $A$ and a random $m\times n$ sketching matrix $S$, the sketched estimate of the inverse covariance matrix $(A^\top A)^{-1}$ is typically biased: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, where $\tilde A=SA$.

Distributed Optimization
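The inversion bias in the abstract is easy to see in simulation. The Monte Carlo sketch below uses a dense Gaussian sketching matrix (a standard baseline; the paper's contribution concerns sparse sketches with smaller bias), for which the bias is a known scalar inflation of roughly $m/(m-d-1)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m, reps = 300, 3, 20, 2000
A = rng.normal(size=(n, d))
target = np.linalg.inv(A.T @ A)          # (A^T A)^{-1}

# Average the sketched inverse-covariance estimate over many sketches
est = np.zeros((d, d))
for _ in range(reps):
    S = rng.normal(size=(m, n)) / np.sqrt(m)   # Gaussian sketch
    At = S @ A
    est += np.linalg.inv(At.T @ At)
est /= reps

# For a Gaussian sketch, E[(SA)^T(SA)]^{-1} inflates (A^T A)^{-1}
# by about m/(m-d-1) = 20/16 = 1.25
ratios = np.diag(est) / np.diag(target)
print(ratios)
```

The ratios concentrate above 1, confirming $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$; this bias matters when averaging such estimates across machines in distributed second-order methods.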

What causes the test error? Going beyond bias-variance via ANOVA

1 code implementation 11 Oct 2020 Licong Lin, Edgar Dobriban

This leads to discovering the unimodality of variance as a function of the level of parametrization, and to decomposing the variance into that arising from label noise, initialization, and randomness in the training data to understand the sources of the error.

DeltaGrad: Rapid retraining of machine learning models

1 code implementation ICML 2020 Yinjun Wu, Edgar Dobriban, Susan B. Davidson

Machine learning models are not static and may need to be retrained on slightly changed datasets, for instance, with the addition or deletion of a set of data points.

BIG-bench Machine Learning

Provable tradeoffs in adversarially robust classification

no code implementations 9 Jun 2020 Edgar Dobriban, Hamed Hassani, David Hong, Alexander Robey

It is well known that machine learning methods can be vulnerable to adversarially-chosen perturbations of their inputs.

Classification General Classification +1

The Implicit Regularization of Stochastic Gradient Flow for Least Squares

no code implementations ICML 2020 Alnur Ali, Edgar Dobriban, Ryan J. Tibshirani

We study the implicit regularization of mini-batch stochastic gradient descent, when applied to the fundamental problem of least squares regression.

regression

Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform

no code implementations 3 Feb 2020 Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci

These show that the convergence rate for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.

Dimensionality Reduction

A Group-Theoretic Framework for Data Augmentation

1 code implementation NeurIPS 2020 Shuxiao Chen, Edgar Dobriban, Jane H Lee

Data augmentation is a widely used trick when training deep neural networks: in addition to the original data, properly transformed data are also added to the training set.

Data Augmentation Image Classification

WONDER: Weighted one-shot distributed ridge regression in high dimensions

1 code implementation 22 Mar 2019 Edgar Dobriban, Yue Sheng

Here we study a fundamental and highly important problem in this area: How to do ridge regression in a distributed computing environment?

Distributed Computing regression +2

Asymptotics for Sketching in Least Squares Regression

1 code implementation NeurIPS 2019 Edgar Dobriban, Sifan Liu

We consider a least squares regression problem where the data has been generated from a linear model, and we are interested in learning the unknown regression parameters.

Dimensionality Reduction regression
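The basic "sketch-and-solve" setup analyzed here compresses a tall regression problem with a random projection and solves the small problem instead. A minimal sketch with a Gaussian projection (one of several sketch types; the paper compares their asymptotics):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 5000, 10, 400                  # tall problem, sketch size m << n
beta = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ beta + rng.normal(scale=0.5, size=n)

# Full least squares on all n rows
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Sketch-and-solve: project to m rows, then solve the small problem
S = rng.normal(size=(m, n)) / np.sqrt(m)
beta_sketch = np.linalg.lstsq(S @ X, S @ y, rcond=None)[0]

print(np.linalg.norm(beta_full - beta), np.linalg.norm(beta_sketch - beta))
```

The sketched solution pays a statistical price relative to the full solve (its error scales with the reduced size m rather than n); quantifying that efficiency loss exactly is what the paper's asymptotic analysis delivers.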

Distributed linear regression by averaging

1 code implementation 30 Sep 2018 Edgar Dobriban, Yue Sheng

Here we study the performance loss in estimation, test error, and confidence interval length in high dimensions, where the number of parameters is comparable to the training data size.

regression
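One-shot averaging, the scheme whose performance loss this paper quantifies, is simple to state: each machine fits least squares on its local shard and the results are averaged once. A minimal sketch (plain OLS with an even split; the paper's analysis covers the high-dimensional regime):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, k = 4000, 20, 8                    # k machines, n/k samples each
beta = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ beta + rng.normal(size=n)

# Centralized estimator using all the data
beta_global = np.linalg.lstsq(X, y, rcond=None)[0]

# One-shot averaging: each machine solves locally, then average once
local_fits = [np.linalg.lstsq(X[i::k], y[i::k], rcond=None)[0]
              for i in range(k)]
beta_avg = np.mean(local_fits, axis=0)

print(np.linalg.norm(beta_global - beta), np.linalg.norm(beta_avg - beta))
```

With d much smaller than n/k the averaged estimator is nearly as accurate as the centralized one; the interesting regime studied in the paper is when d is comparable to the local sample size, where a genuine efficiency loss appears.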

Robust Inference Under Heteroskedasticity via the Hadamard Estimator

1 code implementation 1 Jul 2018 Edgar Dobriban, Weijie J. Su

In this paper, we propose methods that are robust to large and unequal noise in different observational units (i.e., heteroskedasticity) for statistical inference in linear regression.

Statistics Theory Methodology

Flexible Multiple Testing with the FACT Algorithm

1 code implementation 26 Jun 2018 Edgar Dobriban

Modern high-throughput science often leads to multiple testing problems: researchers test many hypotheses, wishing to find the significant discoveries.

Methodology

Deterministic parallel analysis

1 code implementation 11 Nov 2017 Edgar Dobriban, Art B. Owen

This paper presents a deterministic version of PA (DPA), which is faster and more reproducible than PA. We show that DPA selects large factors and does not select small factors, just as [Dobriban, 2017] shows for PA.

Methodology

Permutation methods for factor analysis and PCA

1 code implementation 2 Oct 2017 Edgar Dobriban

In this paper, we show that the parallel analysis permutation method consistently selects the large components in certain high-dimensional factor models.

Statistics Theory Methodology

$e$PCA: High Dimensional Exponential Family PCA

1 code implementation 17 Nov 2016 Lydia T. Liu, Edgar Dobriban, Amit Singer

We develop $e$PCA (exponential family PCA), a new methodology for PCA on exponential family distributions.

Methodology

High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification

1 code implementation 10 Jul 2015 Edgar Dobriban, Stefan Wager

We provide a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random effects model.

Classification General Classification +2

Efficient Computation of Limit Spectra of Sample Covariance Matrices

1 code implementation 7 Jul 2015 Edgar Dobriban

Asymptotically, as $n, p \to \infty$ with $p/n \to \gamma$, there is a deterministic mapping from the population spectral distribution (PSD) to the empirical spectral distribution (ESD) of the eigenvalues.

Numerical Analysis Probability
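The PSD-to-ESD mapping the abstract describes is easy to visualize in the simplest case: for an identity population covariance, the sample eigenvalues fill out the Marchenko-Pastur distribution with support edges $(1\pm\sqrt{\gamma})^2$. A minimal simulation of that special case (the paper computes the mapping for general PSDs):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 2000, 500                          # aspect ratio gamma = p/n = 0.25
gamma = p / n
X = rng.normal(size=(n, p))               # identity population covariance
eigs = np.linalg.eigvalsh(X.T @ X / n)    # sample covariance eigenvalues

# Marchenko-Pastur support edges for the identity PSD
lo, hi = (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2
print(eigs.min(), lo, eigs.max(), hi)
```

Even though every population eigenvalue equals 1, the sample eigenvalues spread over [0.25, 2.25]; recovering the PSD from such a spread-out ESD is the inverse problem this line of work supports.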

Optimal Multiple Testing Under a Gaussian Prior on the Effect Sizes

4 code implementations 12 Apr 2015 Edgar Dobriban, Kristen Fortney, Stuart K. Kim, Art B. Owen

For a Gaussian prior on effect sizes, we show that finding the optimal weights is a non-convex problem.

Methodology
