Search Results for author: Edgar Dobriban

Found 55 papers, 34 papers with code

Inference in Randomized Least Squares and PCA via Normality of Quadratic Forms

1 code implementation 1 Apr 2024 Leda Wang, Zhixiang Zhang, Edgar Dobriban

To our knowledge, no comparable methods are available for SSE and for SRHT in PCA.

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

1 code implementation 28 Mar 2024 Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Hamed Hassani, Eric Wong

To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (2) a jailbreaking dataset comprising 100 behaviors -- both original and sourced from prior work -- which align with OpenAI's usage policies; (3) a standardized evaluation framework that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard that tracks the performance of attacks and defenses for various LLMs.

Minimax Optimal Fair Classification with Bounded Demographic Disparity

1 code implementation 27 Mar 2024 Xianli Zeng, Guang Cheng, Edgar Dobriban

Mitigating the disparate impact of statistical machine learning methods is crucial for ensuring fairness.

Binary Classification Classification +1

Bayes-Optimal Fair Classification with Linear Disparity Constraints via Pre-, In-, and Post-processing

1 code implementation 5 Feb 2024 Xianli Zeng, Guang Cheng, Edgar Dobriban

To address this, we develop methods for Bayes-optimal fair classification, aiming to minimize classification error subject to given group fairness constraints.

Attribute Classification +2

SymmPI: Predictive Inference for Data with Group Symmetries

1 code implementation 26 Dec 2023 Edgar Dobriban, Mengxin Yu

Methods for predictive inference have been developed under a variety of assumptions, often -- for instance, in standard conformal prediction -- relying on the invariance of the distribution of the data under special groups of transformations such as permutation groups.

Conformal Prediction

PAC Prediction Sets Under Label Shift

1 code implementation 19 Oct 2023 Wenwen Si, Sangdon Park, Insup Lee, Edgar Dobriban, Osbert Bastani

We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.

Uncertainty Quantification

Jailbreaking Black Box Large Language Models in Twenty Queries

1 code implementation 12 Oct 2023 Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong

PAIR -- which is inspired by social engineering attacks -- uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM without human intervention.

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

no code implementations 11 Oct 2023 Behrad Moniri, Donghwan Lee, Hamed Hassani, Edgar Dobriban

However, with a constant gradient descent step size, this spike only carries information from the linear component of the target function and therefore learning non-linear components is impossible.

Statistical Estimation Under Distribution Shift: Wasserstein Perturbations and Minimax Theory

2 code implementations 3 Aug 2023 Patrick Chao, Edgar Dobriban

Under a squared loss for mean estimation and prediction error in linear regression, we find the exact minimax risk, a least favorable perturbation, and show that the sample mean and least squares estimators are respectively optimal.

Density Estimation regression

Efficient and Multiply Robust Risk Estimation under General Forms of Dataset Shift

no code implementations 28 Jun 2023 Hongxiang Qiu, Eric Tchetgen Tchetgen, Edgar Dobriban

Despite extensive literature on dataset shift, limited works address how to efficiently use the auxiliary populations to improve the accuracy of risk evaluation for a given machine learning task in the target population.

Domain Adaptation Transfer Learning

Optimal Heterogeneous Collaborative Linear Regression and Contextual Bandits

no code implementations 9 Jun 2023 Xinmeng Huang, Kan Xu, Donghwan Lee, Hamed Hassani, Hamsa Bastani, Edgar Dobriban

MOLAR improves the dependence of the estimation error on the data dimension, compared to independent least squares estimates.

Multi-Armed Bandits regression

Sharp-SSL: Selective high-dimensional axis-aligned random projections for semi-supervised learning

no code implementations 18 Apr 2023 Tengyao Wang, Edgar Dobriban, Milana Gataric, Richard J. Samworth

We propose a new method for high-dimensional semi-supervised learning problems based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data.

Demystifying Disagreement-on-the-Line in High Dimensions

1 code implementation 31 Jan 2023 Donghwan Lee, Behrad Moniri, Xinmeng Huang, Edgar Dobriban, Hamed Hassani

Evaluating the performance of machine learning models under distribution shift is challenging, especially when we only have unlabeled data from the shifted (target) domain, along with labeled data from the original (source) domain.

Conformal Frequency Estimation using Discrete Sketched Data with Coverage for Distinct Queries

1 code implementation 9 Nov 2022 Matteo Sesia, Stefano Favaro, Edgar Dobriban

This paper develops conformal inference methods to construct a confidence interval for the frequency of a queried object in a very large discrete data set, based on a sketch with a lower memory footprint.

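Sketches of the kind this paper builds on (e.g., the count-min sketch) summarize a large stream in a small table while guaranteeing one-sided error: the estimated frequency never falls below the true one. As a minimal illustrative sketch only (not the paper's conformal method), the hash family `rows` below is a hypothetical affine family chosen for simplicity:

```python
from collections import Counter
import numpy as np

rng = np.random.default_rng(5)
stream = rng.integers(0, 1000, size=20_000)  # large discrete data stream

depth, width = 4, 256
table = np.zeros((depth, width), dtype=int)

# Hypothetical simple affine hash family; real implementations use
# pairwise-independent hashes.
a, b = [3, 5, 7, 11], [1, 2, 3, 4]
def rows(x):
    return [(a[r] * int(x) + b[r]) % width for r in range(depth)]

for x in stream:                    # update: increment one cell per row
    for r, c in enumerate(rows(x)):
        table[r, c] += 1

true = Counter(stream.tolist())
query = 42
est = min(table[r, c] for r, c in enumerate(rows(query)))
print(true[query], est)             # est >= true count, by construction
```

Each cell only ever accumulates counts, so the minimum over rows upper-bounds the true frequency; collisions make it an overestimate, which is the bias the conformal intervals in the paper must account for.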
PAC Prediction Sets for Meta-Learning

no code implementations 6 Jul 2022 Sangdon Park, Edgar Dobriban, Insup Lee, Osbert Bastani

Uncertainty quantification is a key component of machine learning models targeted at safety-critical systems such as in healthcare or autonomous vehicles.

Autonomous Vehicles Meta-Learning +1

Pursuit of a Discriminative Representation for Multiple Subspaces via Sequential Games

1 code implementation 18 Jun 2022 Druv Pai, Michael Psenka, Chih-Yuan Chiu, Manxi Wu, Edgar Dobriban, Yi Ma

We consider the problem of learning discriminative representations for data in a high-dimensional space with distribution supported on or around multiple low-dimensional linear subspaces.

Representation Learning

Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces

no code implementations 16 Jun 2022 Yinshuang Xu, Jiahui Lei, Edgar Dobriban, Kostas Daniilidis

We present a unified derivation of kernels via the Fourier domain by leveraging the sparsity of Fourier coefficients of the lifted feature fields.

Point Cloud Classification

Memory Classifiers: Two-stage Classification for Robustness in Machine Learning

no code implementations 10 Jun 2022 Souradeep Dutta, Yahan Yang, Elena Bernardis, Edgar Dobriban, Insup Lee

We propose a new method for classification which can improve robustness to distribution shifts, by combining expert knowledge about the "high-level" structure of the data with standard classifiers.

BIG-bench Machine Learning Classification +3

Collaborative Learning of Discrete Distributions under Heterogeneity and Communication Constraints

no code implementations 1 Jun 2022 Xinmeng Huang, Donghwan Lee, Edgar Dobriban, Hamed Hassani

In modern machine learning, users often have to collaborate to learn the distribution of the data.

PAC-Wrap: Semi-Supervised PAC Anomaly Detection

no code implementations 22 May 2022 Shuo Li, Xiayan Ji, Edgar Dobriban, Oleg Sokolsky, Insup Lee

Anomaly detection is essential for preventing hazardous outcomes for safety-critical applications like autonomous driving.

Autonomous Driving Unsupervised Anomaly Detection

Fair Bayes-Optimal Classifiers Under Predictive Parity

1 code implementation 15 May 2022 Xianli Zeng, Edgar Dobriban, Guang Cheng

This paper considers predictive parity, which requires equalizing the probability of success given a positive prediction among different protected groups.

SE(3)-Equivariant Attention Networks for Shape Reconstruction in Function Space

no code implementations 5 Apr 2022 Evangelos Chatzipantazis, Stefanos Pertigkiozoglou, Edgar Dobriban, Kostas Daniilidis

In contrast to previous shape reconstruction methods that align the input to a regular grid, we operate directly on the irregular point cloud.

3D Shape Reconstruction

Prediction Sets Adaptive to Unknown Covariate Shift

1 code implementation 11 Mar 2022 Hongxiang Qiu, Edgar Dobriban, Eric Tchetgen Tchetgen

Predicting sets of outcomes -- instead of unique outcomes -- is a promising solution to uncertainty quantification in statistical learning.

Uncertainty Quantification

T-Cal: An optimal test for the calibration of predictive models

1 code implementation 3 Mar 2022 Donghwan Lee, Xinmeng Huang, Hamed Hassani, Edgar Dobriban

We find that detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.

Bayes-Optimal Classifiers under Group Fairness

1 code implementation 20 Feb 2022 Xianli Zeng, Edgar Dobriban, Guang Cheng

Machine learning algorithms are becoming integrated into more and more high-stakes decision-making processes, such as in social welfare issues.

BIG-bench Machine Learning Decision Making +1

Learning Augmentation Distributions using Transformed Risk Minimization

no code implementations 16 Nov 2021 Evangelos Chatzipantazis, Stefanos Pertigkiozoglou, Kostas Daniilidis, Edgar Dobriban

We propose a new \emph{Transformed Risk Minimization} (TRM) framework as an extension of classical risk minimization.

Rotated MNIST

Comparing Classes of Estimators: When does Gradient Descent Beat Ridge Regression in Linear Models?

1 code implementation 26 Aug 2021 Dominic Richards, Edgar Dobriban, Patrick Rebeschini

Methods for learning from data depend on various types of tuning parameters, such as penalization strength or step size.

regression

PAC Prediction Sets Under Covariate Shift

1 code implementation ICLR 2022 Sangdon Park, Edgar Dobriban, Insup Lee, Osbert Bastani

Our approach focuses on the setting where there is a covariate shift from the source distribution (where we have labeled training examples) to the target distribution (for which we want to quantify uncertainty).

Uncertainty Quantification

Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition

1 code implementation 17 Mar 2021 Yaodong Yu, Zitong Yang, Edgar Dobriban, Jacob Steinhardt, Yi Ma

To investigate this gap, we decompose the test risk into its bias and variance components and study their behavior as a function of adversarial training perturbation radii ($\varepsilon$).
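The decomposition used here is the classical one: averaging squared test error over independently drawn training sets splits it exactly into a squared-bias term and a variance term. A minimal numpy sketch of that identity (for plain least squares with label noise, not the paper's adversarial-training setting):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test, n_models = 5, 40, 200, 50
w_true = rng.normal(size=d)
X_test = rng.normal(size=(n_test, d))
y_test_clean = X_test @ w_true          # noiseless targets at test points

# Train many models on independent noisy training sets
preds = np.empty((n_models, n_test))
for k in range(n_models):
    X = rng.normal(size=(n_train, d))
    y = X @ w_true + rng.normal(scale=1.0, size=n_train)  # label noise
    w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    preds[k] = X_test @ w_hat

mean_pred = preds.mean(axis=0)
bias2 = np.mean((mean_pred - y_test_clean) ** 2)   # squared bias
variance = np.mean(preds.var(axis=0))              # variance over training sets
risk = np.mean((preds - y_test_clean) ** 2)        # average test risk
print(bias2, variance, risk)  # risk == bias2 + variance (up to float error)
```

The identity holds pointwise at every test input, so it survives averaging; the paper studies how the two terms behave as the adversarial perturbation radius grows.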

Optimal Iterative Sketching Methods with the Subsampled Randomized Hadamard Transform

no code implementations NeurIPS 2020 Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci

These show that the convergence rate for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.

Dimensionality Reduction

Sparse sketches with small inversion bias

no code implementations 21 Nov 2020 Michał Dereziński, Zhenyu Liao, Edgar Dobriban, Michael W. Mahoney

For a tall $n\times d$ matrix $A$ and a random $m\times n$ sketching matrix $S$, the sketched estimate of the inverse covariance matrix $(A^\top A)^{-1}$ is typically biased: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, where $\tilde A=SA$.

Distributed Optimization
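The inversion bias in the abstract is easy to see in simulation. The Monte Carlo sketch below uses a dense Gaussian sketching matrix (a standard baseline; the paper's contribution concerns sparse sketches with smaller bias), for which the bias is a known scalar inflation of roughly $m/(m-d-1)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m, reps = 300, 3, 20, 2000
A = rng.normal(size=(n, d))
target = np.linalg.inv(A.T @ A)          # (A^T A)^{-1}

# Average the sketched inverse-covariance estimate over many sketches
est = np.zeros((d, d))
for _ in range(reps):
    S = rng.normal(size=(m, n)) / np.sqrt(m)   # Gaussian sketch
    At = S @ A
    est += np.linalg.inv(At.T @ At)
est /= reps

# For a Gaussian sketch, E[(SA)^T(SA)]^{-1} inflates (A^T A)^{-1}
# by about m/(m-d-1) = 20/16 = 1.25
ratios = np.diag(est) / np.diag(target)
print(ratios)
```

The ratios concentrate above 1, confirming $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$; this bias matters when averaging such estimates across machines in distributed second-order methods.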

What causes the test error? Going beyond bias-variance via ANOVA

1 code implementation 11 Oct 2020 Licong Lin, Edgar Dobriban

This leads to discovering the unimodality of variance as a function of the level of parametrization, and to decomposing the variance into that arising from label noise, initialization, and randomness in the training data to understand the sources of the error.

DeltaGrad: Rapid retraining of machine learning models

1 code implementation ICML 2020 Yinjun Wu, Edgar Dobriban, Susan B. Davidson

Machine learning models are not static and may need to be retrained on slightly changed datasets, for instance, with the addition or deletion of a set of data points.

BIG-bench Machine Learning

Provable tradeoffs in adversarially robust classification

no code implementations 9 Jun 2020 Edgar Dobriban, Hamed Hassani, David Hong, Alexander Robey

It is well known that machine learning methods can be vulnerable to adversarially-chosen perturbations of their inputs.

Classification General Classification +1

The Implicit Regularization of Stochastic Gradient Flow for Least Squares

no code implementations ICML 2020 Alnur Ali, Edgar Dobriban, Ryan J. Tibshirani

We study the implicit regularization of mini-batch stochastic gradient descent, when applied to the fundamental problem of least squares regression.

regression

Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform

no code implementations 3 Feb 2020 Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci

These show that the convergence rate for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.

Dimensionality Reduction

A Group-Theoretic Framework for Data Augmentation

1 code implementation NeurIPS 2020 Shuxiao Chen, Edgar Dobriban, Jane H Lee

Data augmentation is a widely used trick when training deep neural networks: in addition to the original data, properly transformed data are also added to the training set.

Data Augmentation Image Classification

WONDER: Weighted one-shot distributed ridge regression in high dimensions

1 code implementation 22 Mar 2019 Edgar Dobriban, Yue Sheng

Here we study a fundamental and highly important problem in this area: How to do ridge regression in a distributed computing environment?

Distributed Computing regression +2

Asymptotics for Sketching in Least Squares Regression

1 code implementation NeurIPS 2019 Edgar Dobriban, Sifan Liu

We consider a least squares regression problem where the data has been generated from a linear model, and we are interested in learning the unknown regression parameters.

Dimensionality Reduction regression
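The basic "sketch-and-solve" setup analyzed here compresses a tall regression problem with a random projection and solves the small problem instead. A minimal sketch with a Gaussian projection (one of several sketch types; the paper compares their asymptotics):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 5000, 10, 400                  # tall problem, sketch size m << n
beta = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ beta + rng.normal(scale=0.5, size=n)

# Full least squares on all n rows
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Sketch-and-solve: project to m rows, then solve the small problem
S = rng.normal(size=(m, n)) / np.sqrt(m)
beta_sketch = np.linalg.lstsq(S @ X, S @ y, rcond=None)[0]

print(np.linalg.norm(beta_full - beta), np.linalg.norm(beta_sketch - beta))
```

The sketched solution pays a statistical price relative to the full solve (its error scales with the reduced size m rather than n); quantifying that efficiency loss exactly is what the paper's asymptotic analysis delivers.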

Distributed linear regression by averaging

1 code implementation 30 Sep 2018 Edgar Dobriban, Yue Sheng

Here we study the performance loss in estimation, test error, and confidence interval length in high dimensions, where the number of parameters is comparable to the training data size.

regression
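One-shot averaging, the scheme whose performance loss this paper quantifies, is simple to state: each machine fits least squares on its local shard and the results are averaged once. A minimal sketch (plain OLS with an even split; the paper's analysis covers the high-dimensional regime):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, k = 4000, 20, 8                    # k machines, n/k samples each
beta = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ beta + rng.normal(size=n)

# Centralized estimator using all the data
beta_global = np.linalg.lstsq(X, y, rcond=None)[0]

# One-shot averaging: each machine solves locally, then average once
local_fits = [np.linalg.lstsq(X[i::k], y[i::k], rcond=None)[0]
              for i in range(k)]
beta_avg = np.mean(local_fits, axis=0)

print(np.linalg.norm(beta_global - beta), np.linalg.norm(beta_avg - beta))
```

With d much smaller than n/k the averaged estimator is nearly as accurate as the centralized one; the interesting regime studied in the paper is when d is comparable to the local sample size, where a genuine efficiency loss appears.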

Robust Inference Under Heteroskedasticity via the Hadamard Estimator

1 code implementation 1 Jul 2018 Edgar Dobriban, Weijie J. Su

In this paper, we propose methods that are robust to large and unequal noise in different observational units (i.e., heteroskedasticity) for statistical inference in linear regression.

Statistics Theory Methodology

Flexible Multiple Testing with the FACT Algorithm

1 code implementation 26 Jun 2018 Edgar Dobriban

Modern high-throughput science often leads to multiple testing problems: researchers test many hypotheses, wishing to find the significant discoveries.

Methodology

Deterministic parallel analysis

1 code implementation 11 Nov 2017 Edgar Dobriban, Art B. Owen

This paper presents a deterministic version of PA (DPA), which is faster and more reproducible than PA. We show that DPA selects large factors and does not select small factors, just as [Dobriban, 2017] shows for PA.

Methodology

Permutation methods for factor analysis and PCA

1 code implementation 2 Oct 2017 Edgar Dobriban

In this paper, we show that the parallel analysis permutation method consistently selects the large components in certain high-dimensional factor models.

Statistics Theory Methodology

$e$PCA: High Dimensional Exponential Family PCA

1 code implementation 17 Nov 2016 Lydia T. Liu, Edgar Dobriban, Amit Singer

We develop $e$PCA (exponential family PCA), a new methodology for PCA on exponential family distributions.

Methodology

High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification

1 code implementation 10 Jul 2015 Edgar Dobriban, Stefan Wager

We provide a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random effects model.

Classification General Classification +2

Efficient Computation of Limit Spectra of Sample Covariance Matrices

1 code implementation 7 Jul 2015 Edgar Dobriban

Asymptotically, as $n, p \to \infty$ with $p/n \to \gamma$, there is a deterministic mapping from the population spectral distribution (PSD) to the empirical spectral distribution (ESD) of the eigenvalues.

Numerical Analysis Probability
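The PSD-to-ESD mapping the abstract describes is easy to visualize in the simplest case: for an identity population covariance, the sample eigenvalues fill out the Marchenko-Pastur distribution with support edges $(1\pm\sqrt{\gamma})^2$. A minimal simulation of that special case (the paper computes the mapping for general PSDs):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 2000, 500                          # aspect ratio gamma = p/n = 0.25
gamma = p / n
X = rng.normal(size=(n, p))               # identity population covariance
eigs = np.linalg.eigvalsh(X.T @ X / n)    # sample covariance eigenvalues

# Marchenko-Pastur support edges for the identity PSD
lo, hi = (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2
print(eigs.min(), lo, eigs.max(), hi)
```

Even though every population eigenvalue equals 1, the sample eigenvalues spread over [0.25, 2.25]; recovering the PSD from such a spread-out ESD is the inverse problem this line of work supports.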

Optimal Multiple Testing Under a Gaussian Prior on the Effect Sizes

4 code implementations 12 Apr 2015 Edgar Dobriban, Kristen Fortney, Stuart K. Kim, Art B. Owen

For a Gaussian prior on effect sizes, we show that finding the optimal weights is a non-convex problem.

Methodology
