no code implementations • 29 Oct 2024 • Nikolaos Tsilivis, Gal Vardi, Julia Kempe
We study the implicit bias of the general family of steepest descent algorithms, which includes gradient descent, sign descent and coordinate descent, in deep homogeneous neural networks.
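A minimal sketch of how these three algorithms arise as steepest descent with respect to different norms (NumPy; the function name and setup are ours, for illustration, not the paper's code):

```python
import numpy as np

def steepest_descent_step(w, grad, lr, norm="l2"):
    """One steepest descent step with respect to the chosen norm.

    "l2"   -> (normalized) gradient descent
    "linf" -> sign descent: every coordinate moves by the sign of its gradient
    "l1"   -> coordinate descent: only the coordinate with the largest
              |gradient| moves
    """
    if norm == "l2":
        direction = -grad / (np.linalg.norm(grad) + 1e-12)
    elif norm == "linf":
        direction = -np.sign(grad)
    elif norm == "l1":
        direction = np.zeros_like(grad)
        i = np.argmax(np.abs(grad))
        direction[i] = -np.sign(grad[i])
    else:
        raise ValueError(f"unknown norm: {norm}")
    return w + lr * direction
```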
no code implementations • 21 Oct 2024 • Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe
Indeed, controlling the complexity of the model class is particularly important when data is scarce, noisy or contaminated, as it encodes a statistical belief about the underlying structure of the data.
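As a toy illustration of this point (ours, not the paper's setting): an l2 penalty on a linear model encodes the belief that the underlying signal is small, which helps exactly when samples are few and labels are noisy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 50                               # scarce data: fewer samples than dimensions
w_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.5 * rng.normal(size=n)   # noisy labels

def ridge(X, y, lam):
    # Penalised least squares: the l2 penalty encodes a belief that w is small.
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

for lam in (1e-6, 1.0, 100.0):
    w = ridge(X, y, lam)
    print(f"lambda={lam:g}  estimation error={np.linalg.norm(w - w_star):.3f}")
```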
no code implementations • 7 Jun 2024 • Nikolaos Tsilivis, Natalie Frank, Nathan Srebro, Julia Kempe
We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization.
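A sketch of robust ERM under simplifying assumptions (linear model, logistic loss, l_inf perturbations), where the inner maximization has a closed form; this is an illustration of the objective, not the paper's full setting:

```python
import numpy as np

def robust_erm_gd(X, y, eps, lr=0.1, steps=2000):
    """Gradient descent on the robust logistic loss of a linear model.

    For ||delta||_inf <= eps the inner max has a closed form: the
    worst-case margin is y * <w, x> - eps * ||w||_1.
    Labels y are in {-1, +1}; a subgradient is used at w_j = 0.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w) - eps * np.abs(w).sum()
        s = -1.0 / (1.0 + np.exp(np.clip(margins, -500, 500)))  # dloss/dmargin
        grad = X.T @ (s * y) / n - eps * s.mean() * np.sign(w)
        w -= lr * grad
    return w
```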
no code implementations • 27 Apr 2024 • Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, Julia Kempe
Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations.
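For context, the canonical one-step attack that produces such perturbations, sketched in PyTorch (a generic FGSM, not the evaluation protocol of this paper):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step l_inf attack: move each input coordinate by eps in the
    direction that increases the loss. (Clipping to the valid input
    range is omitted for brevity.)"""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()
```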
no code implementations • 16 Feb 2024 • Benjamin L. Edelman, Ezra Edelman, Surbhi Goel, Eran Malach, Nikolaos Tsilivis
We examine how learning is affected by varying the prior distribution over Markov chains, and consider the generalization of our in-context learning of Markov chains (ICL-MC) task to $n$-grams for $n > 2$.
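A minimal sketch of the kind of data-generating process behind such a task (parameters and names are illustrative): each sequence comes from a fresh Markov chain drawn from a Dirichlet prior, so next-token statistics must be inferred in context:

```python
import numpy as np

def sample_icl_mc_sequence(rng, k=3, length=64, alpha=1.0):
    """Draw a fresh k-state chain from a Dirichlet prior, then a trajectory."""
    P = rng.dirichlet(alpha * np.ones(k), size=k)  # row-stochastic transitions
    seq = [int(rng.integers(k))]
    for _ in range(length - 1):
        seq.append(int(rng.choice(k, p=P[seq[-1]])))
    return np.array(seq), P

rng = np.random.default_rng(0)
seq, P = sample_icl_mc_sequence(rng)  # one in-context learning example
```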
1 code implementation • 13 Nov 2023 • Jingtong Su, Ya Shi Zhang, Nikolaos Tsilivis, Julia Kempe
We further analyze the geometry of networks optimized to be robust against adversarial perturbations of the input, and find that Neural Collapse is a pervasive phenomenon in these cases as well: clean and perturbed representations form aligned simplices, giving rise to a simple and robust nearest-neighbor classifier.
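The nearest-neighbor classifier referred to here can be sketched as nearest-class-mean on last-layer features (illustrative NumPy):

```python
import numpy as np

def nearest_class_mean(features, labels, queries):
    """Classify each query by its nearest class mean. Under Neural Collapse,
    features concentrate on their class means, so this is near-optimal."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(queries[:, None, :] - means[None, :, :], axis=-1)
    return classes[np.argmin(dists, axis=1)]
```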
no code implementations • 5 Jul 2023 • Francesco Cagnetta, Deborah Oliveira, Mahalakshmi Sabanayagam, Nikolaos Tsilivis, Julia Kempe
Lecture notes from the course given by Professor Julia Kempe at the summer school "Statistical physics of Machine Learning" in Les Houches.
1 code implementation • 21 Mar 2023 • William Merrill, Nikolaos Tsilivis, Aman Shukla
Grokking is a phenomenon where a model trained on an algorithmic task first overfits, but then, after a large amount of additional training, undergoes a phase transition to perfect generalization.
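A common algorithmic task in which grokking is observed is modular addition; a sketch of such a dataset (parameters illustrative, not necessarily those used in this paper):

```python
import numpy as np

def modular_addition_split(p=97, train_frac=0.3, seed=0):
    """All pairs (a, b) labeled (a + b) mod p, split into train/test.

    With a small training fraction, models typically fit the train set
    quickly and only much later generalize to the held-out pairs.
    """
    pairs = np.array([(a, b) for a in range(p) for b in range(p)])
    labels = (pairs[:, 0] + pairs[:, 1]) % p
    idx = np.random.default_rng(seed).permutation(len(pairs))
    cut = int(train_frac * len(pairs))
    return ((pairs[idx[:cut]], labels[idx[:cut]]),
            (pairs[idx[cut:]], labels[idx[cut:]]))
```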
1 code implementation • 11 Oct 2022 • Nikolaos Tsilivis, Julia Kempe
The adversarial vulnerability of neural networks, and subsequent techniques to create robust models, have attracted significant attention; yet we still lack a full understanding of this phenomenon.
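A minimal sketch of the empirical NTK that such an analysis starts from, for a scalar-output model (PyTorch; illustrative, not the paper's code):

```python
import torch

def empirical_ntk(model, x1, x2):
    """Empirical NTK between two inputs for a scalar-output model:
    K(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>."""
    params = list(model.parameters())

    def grad_vec(x):
        out = model(x).sum()
        grads = torch.autograd.grad(out, params)
        return torch.cat([g.reshape(-1) for g in grads])

    return grad_vec(x1) @ grad_vec(x2)
```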
1 code implementation • 24 Jul 2022 • Nikolaos Tsilivis, Jingtong Su, Julia Kempe
In parallel, we revisit prior work on data optimization for robust classification (Ilyas et al., 2019), and show that achieving robustness to adversarial attacks after standard (gradient descent) training on a suitable dataset is more challenging than previously thought.
no code implementations • 28 Jan 2022 • William Merrill, Nikolaos Tsilivis
One way to interpret the behavior of a black-box recurrent neural network (RNN) is to extract from it a more interpretable discrete computational model, such as a finite state machine, that captures its behavior.
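A rough sketch of the extraction idea, using k-means clustering of hidden states as a simple stand-in for the paper's state-merging procedure (scikit-learn; all names illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_automaton(hidden_states, symbols, n_states=5):
    """Cluster RNN hidden states and read off transitions between clusters.

    hidden_states: array (T, d) of consecutive hidden vectors on one run
    symbols:       the T-1 input symbols consumed between those states
    Returns a dict mapping (cluster, symbol) -> most frequent next cluster.
    Note: the paper itself uses state merging; k-means is a simpler stand-in.
    """
    ids = KMeans(n_clusters=n_states, n_init=10).fit_predict(hidden_states)
    transitions = {}
    for t, sym in enumerate(symbols):
        transitions.setdefault((ids[t], sym), []).append(ids[t + 1])
    return {k: max(set(v), key=v.count) for k, v in transitions.items()}
```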
no code implementations • 29 Sep 2021 • Nikolaos Tsilivis, Julia Kempe
In particular, in the regime where the Neural Tangent Kernel theory holds, we derive a simple, but powerful strategy for attacking models, which in contrast to prior work, does not require any access to the model under attack, or any trained replica of it for that matter.
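A sketch of the idea of attacking a kernel surrogate instead of the network itself, with an RBF kernel standing in for the NTK (illustrative NumPy, not the paper's construction):

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """Stand-in kernel; the NTK regime would place the (empirical) NTK here."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_attack(X_train, y_train, x, eps, kernel=rbf, h=1e-4):
    """Attack a kernel-regression surrogate f(x) = k(x, X) @ K^{-1} y,
    perturbing x along the sign of a finite-difference input gradient.
    No access to the trained network (or a replica of it) is needed."""
    K = kernel(X_train, X_train)
    alpha = np.linalg.solve(K + 1e-6 * np.eye(len(K)), y_train)
    f = lambda z: float(kernel(z[None, :], X_train) @ alpha)
    grad = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h)
                     for e in np.eye(len(x))])
    return x + eps * np.sign(grad)
```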