Search Results for author: Julia Kempe

Found 25 papers, 10 papers with code

Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks

no code implementations • 29 Oct 2024 • Nikolaos Tsilivis, Gal Vardi, Julia Kempe

We study the implicit bias of the general family of steepest descent algorithms, which includes gradient descent, sign descent and coordinate descent, in deep homogeneous neural networks.
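
The snippet names three members of the steepest descent family; the minimal sketch below (our own illustration, not code from the paper) shows how the same gradient yields different updates depending on the norm that defines "steepest": the l2 norm gives gradient descent, l-infinity gives sign descent, and l1 gives coordinate descent.

```python
# Minimal sketch of the steepest descent family under different norm
# geometries; illustrative only, not the paper's code.
import numpy as np

def steepest_descent_step(w, grad, lr, norm="l2"):
    """One steepest-descent update of w under the chosen norm geometry."""
    if norm == "l2":      # gradient descent
        return w - lr * grad
    if norm == "linf":    # sign descent
        return w - lr * np.sign(grad)
    if norm == "l1":      # coordinate descent: move only the steepest coordinate
        step = np.zeros_like(grad)
        i = np.argmax(np.abs(grad))
        step[i] = np.sign(grad[i])
        return w - lr * step
    raise ValueError(norm)

# toy usage on f(w) = ||w||^2 / 2, whose gradient is w itself
w = np.array([3.0, -1.0, 0.5])
for _ in range(100):
    w = steepest_descent_step(w, grad=w, lr=0.05, norm="linf")
```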

On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds

no code implementations • 21 Oct 2024 • Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe

Indeed, controlling the complexity of the model class is particularly important when data is scarce, noisy or contaminated, as it encodes a statistical belief about the underlying structure of the data.

Binary Classification • Generalization Bounds

Emergent properties with repeated examples

no code implementations • 9 Oct 2024 • François Charton, Julia Kempe

We study the performance of transformers as a function of the number of repetitions of training examples with algorithmically generated datasets.
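
As a hedged sketch of the experimental knob described (the GCD task and all sizes below are our own stand-ins, not the paper's setup), one can generate examples algorithmically into a fixed pool whose size controls how often each example repeats during training:

```python
# Toy data pipeline: a fixed pool of algorithmically generated examples;
# a smaller pool means each example is seen (repeated) more often.
import math
import random

def gcd_example(rng):
    a, b = rng.randrange(1, 1000), rng.randrange(1, 1000)
    return (a, b), math.gcd(a, b)

def make_pool(size, seed=0):
    rng = random.Random(seed)
    return [gcd_example(rng) for _ in range(size)]

pool = make_pool(10_000)           # pool size controls repetition frequency
batch = random.sample(pool, 32)    # training batches are drawn from the pool
```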

Diversity • Memorization

Strong Model Collapse

no code implementations • 7 Oct 2024 • Elvis Dohmatob, Yunzhen Feng, Arjun Subramonian, Julia Kempe

Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical performance degradation due to synthetic data in the training corpus.
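
A toy numerical illustration of this setting (ours, not the paper's experiments): mix labels from an imperfect "generator" model into a regression corpus and watch the test error rise with the synthetic fraction. All sizes and noise levels are illustrative choices.

```python
# Toy illustration: even a modest fraction of synthetic labels, produced by
# an imperfect generator model, degrades regression test error.
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test = 20, 10_000, 2_000
w_star = rng.normal(size=d) / np.sqrt(d)

def sample(n, w, noise=0.1):
    X = rng.normal(size=(n, d))
    return X, X @ w + noise * rng.normal(size=n)

# the "generator": an imperfect model fitted on little real data
Xg, yg = sample(50, w_star)
w_gen = np.linalg.lstsq(Xg, yg, rcond=None)[0]

for frac_synth in [0.0, 0.1, 0.5]:
    n_s = int(frac_synth * n_train)
    Xr, yr = sample(n_train - n_s, w_star)
    Xs, _ = sample(n_s, w_star)
    ys = Xs @ w_gen                     # synthetic labels from the generator
    X, y = np.vstack([Xr, Xs]), np.concatenate([yr, ys])
    w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    Xt, yt = sample(n_test, w_star)
    print(frac_synth, np.mean((Xt @ w_hat - yt) ** 2))
```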

Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification

no code implementations • 11 Jun 2024 • Yunzhen Feng, Elvis Dohmatob, Pu Yang, Francois Charton, Julia Kempe

Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation.
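
The title's remedy is verification; schematically (the helper names below are hypothetical, not the paper's API), synthesized data is gated by a verifier before it enters the corpus:

```python
# Hedged sketch of verifier-gated selection of synthesized data;
# `verifier` and the threshold are hypothetical stand-ins.
def select_verified(generated, verifier, threshold=0.5):
    """Keep only generated examples whose verifier score clears the threshold."""
    return [ex for ex in generated if verifier(ex) >= threshold]

# usage (hypothetical names): corpus += select_verified(llm_samples, reward_score)
```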

News Summarization

The Price of Implicit Bias in Adversarially Robust Generalization

no code implementations • 7 Jun 2024 • Nikolaos Tsilivis, Natalie Frank, Nathan Srebro, Julia Kempe

We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization.
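
For reference, robust ERM is commonly instantiated as adversarial training with a projected-gradient (PGD) inner maximization; the PyTorch sketch below shows that standard formulation (a generic sketch, not code from the paper).

```python
# Standard PGD-based robust ERM: minimize the loss on worst-case
# l_inf-bounded perturbations. Generic sketch, not the paper's code.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # inner ascent step
            delta.clamp_(-eps, eps)              # project onto the l_inf ball
        delta.grad.zero_()
    return (x + delta).detach()

def robust_erm_step(model, optimizer, x, y):
    x_adv = pgd_attack(model, x, y)              # inner max
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)      # outer min
    loss.backward()
    optimizer.step()
    return loss.item()
```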

Iteration Head: A Mechanistic Study of Chain-of-Thought

1 code implementation • 4 Jun 2024 • Vivien Cabannes, Charles Arnal, Wassim Bouaziz, Alice Yang, Francois Charton, Julia Kempe

Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power.

Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

no code implementations • 27 Apr 2024 • Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, Julia Kempe

Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations.

Adversarial Robustness • Semantic Shift Detection

DRoP: Distributionally Robust Pruning

1 code implementation • 8 Apr 2024 • Artem Vysogorets, Kartik Ahuja, Julia Kempe

In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning.

Fairness

Model Collapse Demystified: The Case of Regression

no code implementations • 12 Feb 2024 • Elvis Dohmatob, Yunzhen Feng, Julia Kempe

In the era of proliferation of large language and image generation models, the phenomenon of "model collapse" refers to the situation whereby, as a model is trained recursively on data generated from previous generations of itself, its performance degrades until the model eventually becomes completely useless, i.e., the model collapses.
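
A toy version of the recursion (ours, not the paper's exact model): each generation of a linear regressor is fit on labels produced by the previous generation, and the error relative to the original model accumulates.

```python
# Toy recursive training: generation g is fit on noisy labels produced by
# generation g-1; test error against the original model grows with g.
import numpy as np

rng = np.random.default_rng(1)
d, n = 10, 500
w0 = rng.normal(size=d) / np.sqrt(d)          # generation-0 "true" model
Xt = rng.normal(size=(2000, d))
yt = Xt @ w0                                  # clean test targets

w_gen = w0.copy()
for gen in range(6):
    X = rng.normal(size=(n, d))
    y = X @ w_gen + 0.5 * rng.normal(size=n)  # labels from previous generation
    w_gen = np.linalg.lstsq(X, y, rcond=None)[0]
    print(gen, np.mean((Xt @ w_gen - yt) ** 2))
```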

Image Generation • Regression

A Tale of Tails: Model Collapse as a Change of Scaling Laws

no code implementations • 10 Feb 2024 • Elvis Dohmatob, Yunzhen Feng, Pu Yang, Francois Charton, Julia Kempe

We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data.

Language Modelling • Large Language Model • +1

Deconstructing the Goldilocks Zone of Neural Network Initialization

1 code implementation • 5 Feb 2024 • Artem Vysogorets, Anna Dawid, Julia Kempe

The second-order properties of the training loss have a massive impact on the optimization dynamics of deep learning models.
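
One standard probe of those second-order properties (a generic technique, not necessarily the paper's method) is Hutchinson's estimator of the Hessian trace via double backpropagation:

```python
# Hutchinson estimator of the loss Hessian trace: E[v^T H v] = tr(H)
# for random probe vectors v. Generic sketch, not the paper's code.
import torch

def hessian_trace(loss_fn, params, n_probes=10):
    loss = loss_fn()
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_probes):
        vs = [torch.randn_like(p) for p in params]
        hv = torch.autograd.grad(grads, params, grad_outputs=vs,
                                 retain_graph=True)   # Hessian-vector product
        est += sum((v * h).sum() for v, h in zip(vs, hv)).item()
    return est / n_probes
```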

Discovering Galaxy Features via Dataset Distillation

1 code implementation • 29 Nov 2023 • Haowen Guan, Xuan Zhao, Zishi Wang, Zhiyang Li, Julia Kempe

In many applications, Neural Nets (NNs) have classification performance on par with or even exceeding human capacity.

Dataset Distillation

On the Robustness of Neural Collapse and the Neural Collapse of Robustness

1 code implementation • 13 Nov 2023 • Jingtong Su, Ya Shi Zhang, Nikolaos Tsilivis, Julia Kempe

We further analyze the geometry of networks that are optimized to be robust against adversarial perturbations of the input, and find that Neural Collapse is a pervasive phenomenon in these cases as well, with clean and perturbed representations forming aligned simplices, and giving rise to a robust simple nearest-neighbor classifier.
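
Two standard Neural Collapse diagnostics can be computed directly from penultimate-layer features; the helper below is our own sketch (NC1: vanishing within-class variability; NC2: centered class means forming a simplex equiangular tight frame):

```python
# Neural Collapse diagnostics from penultimate-layer features.
# feats: dict mapping class -> (n_c, d) feature array. Our own sketch.
import numpy as np

def collapse_metrics(feats):
    means = {c: f.mean(0) for c, f in feats.items()}
    g = np.mean(list(means.values()), axis=0)            # global feature mean
    M = np.stack([means[c] - g for c in sorted(feats)])  # centered class means
    # NC1: within-class variability (shrinks relative to between-class spread)
    within = np.mean([np.var(f - means[c], axis=0).sum() for c, f in feats.items()])
    # NC2: cosines between centered class means -> -1/(K-1) off the diagonal
    Mn = M / np.linalg.norm(M, axis=1, keepdims=True)
    return within, Mn @ Mn.T
```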

Kernels, Data & Physics

no code implementations • 5 Jul 2023 • Francesco Cagnetta, Deborah Oliveira, Mahalakshmi Sabanayagam, Nikolaos Tsilivis, Julia Kempe

Lecture notes from the course given by Professor Julia Kempe at the summer school "Statistical physics of Machine Learning" in Les Houches.

Adversarial Robustness • Inductive Bias

Wavelets Beat Monkeys at Adversarial Robustness

no code implementations • 19 Apr 2023 • Jingtong Su, Julia Kempe

Replacing the front-end VOneBlock by an off-the-shelf parameter-free Scatternet followed by simple uniform Gaussian noise can achieve much more substantial adversarial robustness without adversarial training.
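
A sketch of such a front-end, assuming kymatio's Scattering2D for the parameter-free scattering transform (the noise level and input shape are illustrative choices, not the paper's settings):

```python
# Parameter-free wavelet scattering front-end followed by additive Gaussian
# noise, to be placed before a standard backbone. Assumes kymatio is installed.
import torch
import torch.nn as nn
from kymatio.torch import Scattering2D

class ScatterFrontEnd(nn.Module):
    def __init__(self, shape=(32, 32), J=2, sigma=0.1):
        super().__init__()
        self.scattering = Scattering2D(J=J, shape=shape)  # no learned parameters
        self.sigma = sigma

    def forward(self, x):                    # x: (B, C, H, W)
        s = self.scattering(x)               # (B, C, K, H / 2**J, W / 2**J)
        s = s.flatten(1, 2)                  # merge channel and scattering dims
        return s + self.sigma * torch.randn_like(s)   # uniform Gaussian noise
```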

Adversarial Attack • Adversarial Robustness

What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?

1 code implementation • 11 Oct 2022 • Nikolaos Tsilivis, Julia Kempe

The adversarial vulnerability of neural nets, and subsequent techniques to create robust models, have attracted significant attention; yet we still lack a full understanding of this phenomenon.

Adversarial Robustness

ImpressLearn: Continual Learning via Combined Task Impressions

no code implementations • 5 Oct 2022 • Dhrupad Bhardwaj, Julia Kempe, Artem Vysogorets, Angela M. Teng, Evaristus C. Ezekwem

Starting from existing work on network masking (Wortsman et al., 2020), we show that simply learning a linear combination of a small number of task-specific supermasks (impressions) on a randomly initialized backbone network is sufficient both to retain accuracy on previously learned tasks and to achieve high accuracy on unseen tasks.
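
Schematically (our own single-layer sketch; shapes and scope are illustrative), only the combination coefficients are trained, while the backbone weights and the task masks stay fixed:

```python
# Learn only a linear combination of fixed binary "supermasks" over a frozen,
# randomly initialized weight matrix. Our own single-layer sketch.
import torch
import torch.nn as nn

class MaskCombinationLinear(nn.Module):
    def __init__(self, d_in, d_out, masks):  # masks: (M, d_out, d_in), binary, fixed
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in), requires_grad=False)
        self.register_buffer("masks", masks.float())
        m = masks.shape[0]
        self.coeffs = nn.Parameter(torch.ones(m) / m)  # the only trainable part

    def forward(self, x):
        mask = torch.einsum("m,moi->oi", self.coeffs, self.masks)
        return x @ (self.weight * mask).T
```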

Continual Learning • Image Classification • +1

Can we achieve robustness from data alone?

1 code implementation • 24 Jul 2022 • Nikolaos Tsilivis, Jingtong Su, Julia Kempe

In parallel, we revisit prior work that also focused on the problem of data optimization for robust classification (Ilyas et al., 2019), and show that being robust to adversarial attacks after standard (gradient descent) training on a suitable dataset is more challenging than previously thought.

Meta-Learning • Regression • +1

The NTK Adversary: An Approach to Adversarial Attacks without any Model Access

no code implementations • 29 Sep 2021 • Nikolaos Tsilivis, Julia Kempe

In particular, in the regime where the Neural Tangent Kernel theory holds, we derive a simple but powerful strategy for attacking models which, in contrast to prior work, does not require any access to the model under attack, or any trained replica of it for that matter.
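
In spirit, such an attack differentiates the kernel-regression predictor induced by the training data alone; the sketch below uses an RBF kernel as a stand-in for the NTK (all names and parameters are ours):

```python
# Model-free kernel attack sketch: differentiate the kernel predictor
# f(x) = k(x, X_train) @ alpha w.r.t. the input x. RBF stands in for the NTK.
import numpy as np

def rbf(X, Z, gamma=1.0):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_attack_grad(x, X_train, y_train, gamma=1.0, reg=1e-3):
    """Gradient of the kernel-regression predictor at x; ascend it to attack."""
    K = rbf(X_train, X_train, gamma) + reg * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train)
    k = rbf(x[None], X_train, gamma)[0]                     # (n,)
    grad_k = -2 * gamma * (x[None] - X_train) * k[:, None]  # d k_i / d x
    return grad_k.T @ alpha

# e.g. an FGSM-style step: x_adv = x + eps * np.sign(kernel_attack_grad(...))
```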

Learning Theory

Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity

1 code implementation • 5 Jul 2021 • Artem Vysogorets, Julia Kempe

Neural network pruning is a fruitful area of research with surging interest in high sparsity regimes.

Benchmarking • Network Pruning

Quantum random walks - an introductory overview

1 code implementation • 13 Mar 2003 • Julia Kempe

This article aims to provide an introductory survey on quantum random walks.
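
For the flavor of the subject, here is the textbook discrete-time Hadamard walk on the line (a standard construction consistent with the survey's topic, not code from the paper):

```python
# Discrete-time Hadamard quantum walk on the line: apply the coin,
# then shift position conditioned on the coin state. Textbook construction.
import numpy as np

steps, n = 100, 201                          # positions -100 .. 100
psi = np.zeros((n, 2), dtype=complex)
psi[n // 2, 0] = 1.0                         # start at the origin, coin |0>
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

for _ in range(steps):
    psi = psi @ H.T                          # coin flip on every position
    psi[:, 0] = np.roll(psi[:, 0], 1)        # coin |0> moves right
    psi[:, 1] = np.roll(psi[:, 1], -1)       # coin |1> moves left

prob = (np.abs(psi) ** 2).sum(axis=1)        # position distribution after 100 steps
```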

Quantum Physics Data Structures and Algorithms

Quantum Walks On Graphs

no code implementations • 18 Dec 2000 • Dorit Aharonov, Andris Ambainis, Julia Kempe, Umesh Vazirani

We set the ground for a theory of quantum walks on graphs: the generalization of random walks on finite graphs to the quantum world.

Quantum Physics

Universal simulation of Markovian quantum dynamics

no code implementations • 15 Aug 2000 • Dave Bacon, Andrew M. Childs, Isaac L. Chuang, Julia Kempe, Debbie W. Leung, Xinlan Zhou

Although the conditions for performing arbitrary unitary operations to simulate the dynamics of a closed quantum system are well understood, the same is not true of the more general class of quantum operations (also known as superoperators) corresponding to the dynamics of open quantum systems.
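
The quantum operations in question act on density matrices via Kraus operators; a textbook single-qubit example (amplitude damping, not code from the paper) is shown below.

```python
# A quantum operation (superoperator) in Kraus form:
# rho -> sum_k K_k rho K_k^dagger. Example: single-qubit amplitude damping.
import numpy as np

def apply_channel(rho, kraus):
    return sum(K @ rho @ K.conj().T for K in kraus)

p = 0.1                                          # damping probability per step
K0 = np.array([[1, 0], [0, np.sqrt(1 - p)]])
K1 = np.array([[0, np.sqrt(p)], [0, 0]])

rho = np.array([[0, 0], [0, 1]], dtype=complex)  # start in |1><1|
for _ in range(20):
    rho = apply_channel(rho, [K0, K1])
print(rho.real)                                  # relaxes toward |0><0|
```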

Quantum Physics
