no code implementations • 29 Oct 2024 • Nikolaos Tsilivis, Gal Vardi, Julia Kempe

We study the implicit bias of the general family of steepest descent algorithms, which includes gradient descent, sign descent and coordinate descent, in deep homogeneous neural networks.
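As a rough illustration of how the three named algorithms fit one family: each picks the unit-norm direction that most decreases the linearized loss, but under a different norm (Euclidean for normalized gradient descent, max-norm for sign descent, l1 for coordinate descent). The sketch below is an assumption-laden toy, not the paper's code; the function names `sign` and `steepest_direction` and the norm labels are hypothetical.

```python
import math


def sign(x):
    """Return -1.0, 0.0 or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def steepest_direction(grad, norm):
    """Unit-norm direction d that minimizes <grad, d> under the chosen norm.

    norm = "l2"   -> normalized gradient descent
    norm = "linf" -> sign descent
    norm = "l1"   -> coordinate descent (move only the largest-gradient coordinate)
    """
    if norm == "l2":
        scale = math.sqrt(sum(g * g for g in grad))
        return [-g / scale if scale > 0 else 0.0 for g in grad]
    if norm == "linf":
        return [-sign(g) for g in grad]
    if norm == "l1":
        i = max(range(len(grad)), key=lambda k: abs(grad[k]))
        return [-sign(grad[k]) if k == i else 0.0 for k in range(len(grad))]
    raise ValueError(f"unknown norm: {norm}")


if __name__ == "__main__":
    g = [3.0, -4.0]
    for name in ("l2", "linf", "l1"):
        print(name, steepest_direction(g, name))
```

A full optimizer would multiply each direction by a step size (and, in the usual formulation, by the dual norm of the gradient); that scaling is omitted here to keep the comparison of directions visible.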

no code implementations • 21 Oct 2024 • Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe

Indeed, controlling the complexity of the model class is particularly important when data is scarce, noisy or contaminated, as it translates a statistical belief on the underlying structure of the data.

no code implementations • 9 Oct 2024 • François Charton, Julia Kempe

We study the performance of transformers as a function of the number of repetitions of training examples with algorithmically generated datasets.

no code implementations • 7 Oct 2024 • Elvis Dohmatob, Yunzhen Feng, Arjun Subramonian, Julia Kempe

Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical performance degradation due to synthetic data in the training corpus.

no code implementations • 11 Jun 2024 • Yunzhen Feng, Elvis Dohmatob, Pu Yang, Francois Charton, Julia Kempe

Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation.

no code implementations • 7 Jun 2024 • Nikolaos Tsilivis, Natalie Frank, Nathan Srebro, Julia Kempe

We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization.

1 code implementation • 4 Jun 2024 • Vivien Cabannes, Charles Arnal, Wassim Bouaziz, Alice Yang, Francois Charton, Julia Kempe

Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power.

no code implementations • 27 Apr 2024 • Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, Julia Kempe

Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations.

1 code implementation • 8 Apr 2024 • Artem Vysogorets, Kartik Ahuja, Julia Kempe

In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning.

1 code implementation • 14 Mar 2024 • Tim G. J. Rudner, Ya Shi Zhang, Andrew Gordon Wilson, Julia Kempe

Machine learning models often perform poorly under subpopulation shifts in the data distribution.

no code implementations • 12 Feb 2024 • Elvis Dohmatob, Yunzhen Feng, Julia Kempe

In the era of proliferation of large language and image generation models, the phenomenon of "model collapse" refers to the situation whereby, as a model is trained recursively on data generated from previous generations of itself over time, its performance degrades until the model eventually becomes completely useless, i.e., the model collapses.

no code implementations • 10 Feb 2024 • Elvis Dohmatob, Yunzhen Feng, Pu Yang, Francois Charton, Julia Kempe

We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data.

1 code implementation • 5 Feb 2024 • Artem Vysogorets, Anna Dawid, Julia Kempe

The second-order properties of the training loss have a massive impact on the optimization dynamics of deep learning models.

1 code implementation • 29 Nov 2023 • Haowen Guan, Xuan Zhao, Zishi Wang, Zhiyang Li, Julia Kempe

In many applications, Neural Nets (NNs) have classification performance on par with or even exceeding human capacity.

no code implementations • 13 Nov 2023 • Jingtong Su, Ya Shi Zhang, Nikolaos Tsilivis, Julia Kempe

Neural Collapse refers to the curious phenomenon at the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex).

no code implementations • 5 Jul 2023 • Francesco Cagnetta, Deborah Oliveira, Mahalakshmi Sabanayagam, Nikolaos Tsilivis, Julia Kempe

Lecture notes from the course given by Professor Julia Kempe at the summer school "Statistical physics of Machine Learning" in Les Houches.

no code implementations • 19 Apr 2023 • Jingtong Su, Julia Kempe

Replacing the front-end VOneBlock by an off-the-shelf parameter-free Scatternet followed by simple uniform Gaussian noise can achieve much more substantial adversarial robustness without adversarial training.

1 code implementation • 11 Oct 2022 • Nikolaos Tsilivis, Julia Kempe

The adversarial vulnerability of neural nets, and subsequent techniques to create robust models have attracted significant attention; yet we still lack a full understanding of this phenomenon.

no code implementations • 5 Oct 2022 • Dhrupad Bhardwaj, Julia Kempe, Artem Vysogorets, Angela M. Teng, Evaristus C. Ezekwem

Starting from existing work on network masking (Wortsman et al., 2020), we show that simply learning a linear combination of a small number of task-specific supermasks (impressions) on a randomly initialized backbone network is sufficient to both retain accuracy on previously learned tasks and achieve high accuracy on unseen tasks.

1 code implementation • 24 Jul 2022 • Nikolaos Tsilivis, Jingtong Su, Julia Kempe

In parallel, we revisit prior work that also focused on the problem of data optimization for robust classification [Ily+19], and show that being robust to adversarial attacks after standard (gradient descent) training on a suitable dataset is more challenging than previously thought.

no code implementations • 29 Sep 2021 • Nikolaos Tsilivis, Julia Kempe

In particular, in the regime where the Neural Tangent Kernel theory holds, we derive a simple but powerful strategy for attacking models which, in contrast to prior work, does not require any access to the model under attack or to any trained replica of it.

1 code implementation • 5 Jul 2021 • Artem Vysogorets, Julia Kempe

Neural network pruning is a fruitful area of research with surging interest in high sparsity regimes.

1 code implementation • 13 Mar 2003 • Julia Kempe

This article aims to provide an introductory survey on quantum random walks.

Quantum Physics, Data Structures and Algorithms

no code implementations • 18 Dec 2000 • Dorit Aharonov, Andris Ambainis, Julia Kempe, Umesh Vazirani

We set the ground for a theory of quantum walks on graphs, the generalization of random walks on finite graphs to the quantum world.

Quantum Physics
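
To make the idea of a quantum walk on a graph concrete, here is a minimal sketch of a standard coined (Hadamard) walk on a cycle. The choice of graph, coin, and the function name `hadamard_walk_on_cycle` are assumptions made for illustration and are not taken from the paper itself.

```python
import math


def hadamard_walk_on_cycle(n_nodes, steps, start=0):
    """Position probabilities after `steps` steps of a coined walk on a cycle.

    The walker carries a two-state coin. Each step applies a Hadamard
    transform to the coin, then moves the walker one node counter-clockwise
    (coin 0) or clockwise (coin 1).
    """
    # amp[c][x] is the complex amplitude of coin state c at node x.
    amp = [[0j] * n_nodes for _ in range(2)]
    amp[0][start] = 1 + 0j  # start at `start` with coin state 0
    s = 1 / math.sqrt(2)
    for _ in range(steps):
        new = [[0j] * n_nodes for _ in range(2)]
        for x in range(n_nodes):
            a0, a1 = amp[0][x], amp[1][x]
            # Hadamard coin followed by the coin-conditioned shift.
            new[0][(x - 1) % n_nodes] += s * (a0 + a1)
            new[1][(x + 1) % n_nodes] += s * (a0 - a1)
        amp = new
    # Measuring position sums the probabilities of both coin states.
    return [abs(amp[0][x]) ** 2 + abs(amp[1][x]) ** 2 for x in range(n_nodes)]


if __name__ == "__main__":
    probs = hadamard_walk_on_cycle(n_nodes=8, steps=5)
    print([round(p, 3) for p in probs])
```

Unlike a classical random walk, the amplitudes here can interfere, so the resulting distribution over nodes is generally not the one a fair-coin random walk would produce after the same number of steps.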

no code implementations • 15 Aug 2000 • Dave Bacon, Andrew M. Childs, Isaac L. Chuang, Julia Kempe, Debbie W. Leung, Xinlan Zhou

Although the conditions for performing arbitrary unitary operations to simulate the dynamics of a closed quantum system are well understood, the same is not true of the more general class of quantum operations (also known as superoperators) corresponding to the dynamics of open quantum systems.

Quantum Physics

Papers With Code is a free resource with all data licensed under CC-BY-SA.