Search Results for author: Fanny Yang

Found 32 papers, 17 papers with code

Atmospheric Transport Modeling of CO$_2$ with Neural Networks

2 code implementations · 20 Aug 2024 · Vitus Benson, Ana Bastos, Christian Reimers, Alexander J. Winkler, Fanny Yang, Markus Reichstein

Accurately describing the distribution of CO$_2$ in the atmosphere with atmospheric tracer transport models is essential for greenhouse gas monitoring and verification support systems to aid implementation of international climate agreements.

Strong Copyright Protection for Language Models via Adaptive Model Fusion

no code implementations · 29 Jul 2024 · Javier Abad, Konstantin Donhauser, Francesco Pinto, Fanny Yang

The risk of language models unintentionally reproducing copyrighted material from their training data has led to the development of various protective measures.

Code Generation Memorization

Robust Mixture Learning when Outliers Overwhelm Small Groups

no code implementations · 22 Jul 2024 · Daniil Dmitriev, Rares-Darius Buhai, Stefan Tiegel, Alexander Wolters, Gleb Novikov, Amartya Sanyal, David Steurer, Fanny Yang

We study the problem of estimating the means of well-separated mixtures when an adversary may add arbitrary outliers.

Detecting critical treatment effect bias in small subgroups

1 code implementation · 29 Apr 2024 · Piersilvio De Bartolomeis, Javier Abad, Konstantin Donhauser, Fanny Yang

Randomized trials are considered the gold standard for making informed decisions in medicine, yet they often lack generalizability to the patient populations in clinical practice.

Benchmarking Decision Making +1

Privacy-preserving data release leveraging optimal transport and particle gradient descent

1 code implementation · 31 Jan 2024 · Konstantin Donhauser, Javier Abad, Neha Hulkund, Fanny Yang

We present a novel approach for differentially private data synthesis of protected tabular datasets, a relevant task in highly sensitive domains such as healthcare and government.

Privacy Preserving

Hidden yet quantifiable: A lower bound for confounding strength using randomized trials

2 code implementations · 6 Dec 2023 · Piersilvio De Bartolomeis, Javier Abad, Konstantin Donhauser, Fanny Yang

Further, we show how our lower bound can correctly identify the absence and presence of unobserved confounding in a real-world setting.

valid

Can semi-supervised learning use all the data effectively? A lower bound perspective

no code implementations · NeurIPS 2023 · Alexandru Ţifrea, Gizem Yüce, Amartya Sanyal, Fanny Yang

Prior works have shown that semi-supervised learning algorithms can leverage unlabeled data to improve over the labeled sample complexity of supervised learning (SL) algorithms.

How robust accuracy suffers from certified training with convex relaxations

no code implementations · 12 Jun 2023 · Piersilvio De Bartolomeis, Jacob Clarysse, Amartya Sanyal, Fanny Yang

In this paper, we systematically compare the standard and robust error of these two robust training paradigms across multiple computer vision tasks.

PILLAR: How to make semi-private learning more effective

1 code implementation · 6 Jun 2023 · Francesco Pinto, Yaxi Hu, Fanny Yang, Amartya Sanyal

In Semi-Supervised Semi-Private (SP) learning, the learner has access to both public unlabelled and private labelled data.

Strong inductive biases provably prevent harmless interpolation

1 code implementation · 18 Jan 2023 · Michael Aerni, Marco Milanta, Konstantin Donhauser, Fanny Yang

Classical wisdom suggests that estimators should avoid fitting noise to achieve good generalization.

Inductive Bias

Tight bounds for maximum $\ell_1$-margin classifiers

no code implementations · 7 Dec 2022 · Stefan Stojanovic, Konstantin Donhauser, Fanny Yang

In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths.

Margin-based sampling in high dimensions: When being active is less efficient than staying passive

no code implementations · 1 Dec 2022 · Alexandru Tifrea, Jacob Clarysse, Fanny Yang

It is widely believed that given the same labeling budget, active learning (AL) algorithms like margin-based active learning achieve better predictive performance than passive learning (PL), albeit at a higher computational cost.

Active Learning

How unfair is private learning ?

no code implementations · 8 Jun 2022 · Amartya Sanyal, Yaxi Hu, Fanny Yang

As machine learning algorithms are deployed on sensitive data in critical decision making processes, it is becoming increasingly important that they are also private and fair.

Decision Making Fairness

Provable concept learning for interpretable predictions using variational autoencoders

2 code implementations · 1 Apr 2022 · Armeen Taeb, Nicolo Ruggeri, Carina Schnuck, Fanny Yang

In safety-critical applications, practitioners are reluctant to trust neural networks when no interpretable explanations are available.

Variational Inference

Fast Rates for Noisy Interpolation Require Rethinking the Effects of Inductive Bias

1 code implementation · 7 Mar 2022 · Konstantin Donhauser, Nicolo Ruggeri, Stefan Stojanovic, Fanny Yang

Good generalization performance on high-dimensional data crucially hinges on a simple structure of the ground truth and a corresponding strong inductive bias of the estimator.

Inductive Bias valid

Why adversarial training can hurt robust accuracy

no code implementations · 3 Mar 2022 · Jacob Clarysse, Julia Hörmann, Fanny Yang

Machine learning classifiers with high test accuracy often perform poorly under adversarial attacks.

Tight bounds for minimum l1-norm interpolation of noisy data

1 code implementation · 10 Nov 2021 · Guillaume Wang, Konstantin Donhauser, Fanny Yang

We provide matching upper and lower bounds of order $\sigma^2/\log(d/n)$ for the prediction error of the minimum $\ell_1$-norm interpolator, a. k. a.

Self-supervised Reinforcement Learning with Independently Controllable Subgoals

no code implementations · 9 Sep 2021 · Andrii Zadaianchuk, Georg Martius, Fanny Yang

We propose a novel self-supervised agent that estimates relations between environment components and uses them to independently control different parts of the environment state.

reinforcement-learning Reinforcement Learning (RL)

Interpolation can hurt robust generalization even when there is no noise

2 code implementations · NeurIPS 2021 · Konstantin Donhauser, Alexandru Ţifrea, Michael Aerni, Reinhard Heckel, Fanny Yang

Numerous recent works show that overparameterization implicitly reduces variance for min-norm interpolators and max-margin classifiers.

regression

Maximizing the robust margin provably overfits on noiseless data

1 code implementation · ICML Workshop AML 2021 · Konstantin Donhauser, Alexandru Tifrea, Michael Aerni, Reinhard Heckel, Fanny Yang

Numerous recent works show that overparameterization implicitly reduces variance, suggesting vanishing benefits for explicit regularization in high dimensions.

Attribute

Semi-supervised novelty detection using ensembles with regularized disagreement

1 code implementation · 10 Dec 2020 · Alexandru Ţifrea, Eric Stavarache, Fanny Yang

Deep neural networks often predict samples with high confidence even when they come from unseen classes and should instead be flagged for expert evaluation.

 Ranked #1 on Out-of-Distribution Detection on CIFAR-10 vs CIFAR-10.1 (using extra training data)

Novelty Detection Out-of-Distribution Detection +1

Understanding and Mitigating the Tradeoff Between Robustness and Accuracy

1 code implementation · ICML 2020 · Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John Duchi, Percy Liang

In this work, we precisely characterize the effect of augmentation on the standard error in linear regression when the optimal linear predictor has zero standard and robust error.

regression

When Covariate-shifted Data Augmentation Increases Test Error And How to Fix It

no code implementations · 25 Sep 2019 · Sang Michael Xie*, Aditi Raghunathan*, Fanny Yang, John C. Duchi, Percy Liang

Empirically, data augmentation sometimes improves and sometimes hurts test error, even when only adding points with labels from the true conditional distribution that the hypothesis class is expressive enough to fit.

Data Augmentation regression

Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness

no code implementations · NeurIPS 2019 · Fanny Yang, Zuowen Wang, Christina Heinze-Deml

This work provides theoretical and empirical evidence that invariance-inducing regularizers can increase predictive accuracy for worst-case spatial transformations (spatial robustness).

Adversarial Training Can Hurt Generalization

no code implementations · ICML Workshop Deep_Phenomen 2019 · Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John C. Duchi, Percy Liang

While adversarial training can improve robust accuracy (against an adversary), it sometimes hurts standard accuracy (when there is no adversary).

Regularized Learning for Domain Adaptation under Label Shifts

2 code implementations · ICLR 2019 · Kamyar Azizzadenesheli, Anqi Liu, Fanny Yang, Animashree Anandkumar

We derive a generalization bound for the classifier on the target domain which is independent of the (ambient) data dimensions, and instead only depends on the complexity of the function class.

Domain Adaptation

Online control of the false discovery rate with decaying memory

1 code implementation · NeurIPS 2017 · Aaditya Ramdas, Fanny Yang, Martin J. Wainwright, Michael I. Jordan

In the online multiple testing problem, p-values corresponding to different null hypotheses are observed one by one, and the decision of whether or not to reject the current hypothesis must be made immediately, after which the next p-value is observed.

Unity

Early stopping for kernel boosting algorithms: A general analysis with localized complexities

no code implementations · NeurIPS 2017 · Yuting Wei, Fanny Yang, Martin J. Wainwright

Early stopping of iterative algorithms is a widely-used form of regularization in statistics, commonly used in conjunction with boosting and related gradient-type algorithms.

A framework for Multi-A(rmed)/B(andit) testing with online FDR control

1 code implementation · NeurIPS 2017 · Fanny Yang, Aaditya Ramdas, Kevin Jamieson, Martin J. Wainwright

We propose an alternative framework to existing setups for controlling false alarms when multiple A/B tests are run over time.

valid

Statistical and Computational Guarantees for the Baum-Welch Algorithm

no code implementations · 27 Dec 2015 · Fanny Yang, Sivaraman Balakrishnan, Martin J. Wainwright

By exploiting this characterization, we provide non-asymptotic finite sample guarantees on the Baum-Welch updates, guaranteeing geometric convergence to a small ball of radius on the order of the minimax rate around a global optimum.

Econometrics speech-recognition +3
