Search Results for author: Stanislav Fort

Found 26 papers, 14 papers with code

Scaling Laws for Adversarial Attacks on Language Model Activations

no code implementations • 5 Dec 2023 • Stanislav Fort

We find that the number of bits of control in the input space needed to control a single bit in the output space (what we call attack resistance $\chi$) is remarkably stable, between $\approx 16$ and $\approx 25$, across 2 orders of magnitude of model size for different language models.

Language Modelling
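As a rough formalization of the quantity described above (the ratio notation is introduced here for clarity, not taken verbatim from the paper):

$$\chi \;\equiv\; \frac{\#\,\text{bits of attacker control in the input}}{\#\,\text{bits determined in the output}}, \qquad 16 \lesssim \chi \lesssim 25$$

with the empirical range holding across roughly two orders of magnitude of model size.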

Multi-attacks: Many images $+$ the same adversarial attack $\to$ many target labels

1 code implementation • 4 Aug 2023 • Stanislav Fort

We show that we can easily design a single adversarial perturbation $P$ that changes the class of $n$ images $X_1, X_2,\dots, X_n$ from their original, unperturbed classes $c_1, c_2,\dots, c_n$ to desired (not necessarily all the same) classes $c^*_1, c^*_2,\dots, c^*_n$ for up to hundreds of images and target classes at once.

Adversarial Attack
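A minimal sketch of the mechanism in the abstract: optimize one shared perturbation $P$ by gradient descent so that each image $X_i + P$ is classified as its own target $c^*_i$. This is an illustrative PyTorch reconstruction under assumed hyperparameters, not the paper's reference implementation (which is linked from the entry above).

import torch
import torch.nn.functional as F

def multi_attack(model, images, targets, steps=500, lr=1e-2, eps=8/255):
    # One perturbation P shared by every image; the targets may all differ.
    P = torch.zeros_like(images[0], requires_grad=True)
    opt = torch.optim.Adam([P], lr=lr)
    for _ in range(steps):
        logits = model((images + P).clamp(0, 1))   # P broadcasts over the batch
        loss = F.cross_entropy(logits, targets)    # push image i toward target i
        opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():
            P.clamp_(-eps, eps)                    # optional size constraint (assumed)
    return P.detach()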

Adversarial vulnerability of powerful near out-of-distribution detection

1 code implementation • 18 Jan 2022 • Stanislav Fort

There has been significant progress recently in detecting out-of-distribution (OOD) inputs in neural networks, primarily due to the use of large models pretrained on large datasets, and an emerging use of multi-modality.

Adversarial Robustness • Out-of-Distribution Detection • +1

How many degrees of freedom do we need to train deep networks: a loss landscape perspective

1 code implementation • ICLR 2022 • Brett W. Larsen, Stanislav Fort, Nic Becker, Surya Ganguli

In particular, we show, via Gordon's escape theorem, that the training dimension plus the Gaussian width of the desired loss sub-level set, projected onto a unit sphere surrounding the initialization, must exceed the total number of parameters for the success probability to be large.
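For reference, the condition has the shape of the standard escape-through-a-mesh bound: with $D$ total parameters, training dimension $d$, and $\tilde{S}$ the projected sub-level set on the unit sphere, high success probability requires roughly

$$d + w(\tilde{S})^2 \;\gtrsim\; D,$$

where $w(\cdot)$ is the Gaussian width; the squared width follows the usual statement of Gordon's theorem and is a reading of the abstract, not a verbatim formula from the paper.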

A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection

3 code implementations • 16 Jun 2021 • Jie Ren, Stanislav Fort, Jeremiah Liu, Abhijit Guha Roy, Shreyas Padhy, Balaji Lakshminarayanan

Mahalanobis distance (MD) is a simple and popular post-processing method for detecting out-of-distribution (OOD) inputs in neural networks.

Intent Detection • Out-of-Distribution Detection • +1
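For context, the "simple fix" the paper proposes is the relative Mahalanobis distance: subtract from each class-conditional Mahalanobis distance the distance under a single class-agnostic background Gaussian. A hedged NumPy sketch; the shared-covariance fitting and tensor shapes follow common practice and are assumptions here, not the authors' code.

import numpy as np

def fit_gaussians(feats, labels):
    # Per-class means with a shared covariance, plus one background Gaussian.
    classes = np.unique(labels)
    mus = np.stack([feats[labels == c].mean(0) for c in classes])
    centered = feats - mus[np.searchsorted(classes, labels)]
    shared_icov = np.linalg.pinv(centered.T @ centered / len(feats))
    mu0, icov0 = feats.mean(0), np.linalg.pinv(np.cov(feats, rowvar=False))
    return mus, shared_icov, mu0, icov0

def ood_scores(z, mus, icov, mu0, icov0):
    # Returns (MD score, relative-MD score); higher means more in-distribution.
    d = z - mus                                # (num_classes, feat_dim)
    md = np.einsum('cd,de,ce->c', d, icov, d)  # class-conditional Mahalanobis
    md0 = (z - mu0) @ icov0 @ (z - mu0)        # background (class-agnostic) term
    return -md.min(), -(md - md0).min()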

Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error

no code implementations • 27 May 2021 • Stanislav Fort, Andrew Brock, Razvan Pascanu, Soham De, Samuel L. Smith

In this work, we provide a detailed empirical evaluation of how the number of augmentation samples per unique image influences model performance on held-out data when training deep ResNets.

Data Augmentation • Image Classification
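The training-loop change being evaluated is simple to state in code: draw several independent augmentations of each unique image and average their gradients. A PyTorch-flavored sketch with placeholder names (augment is any stochastic transform), not the authors' pipeline.

import torch

def multi_augment_batch(images, labels, augment, num_samples=4):
    # Each unique image contributes num_samples independent augmented copies;
    # the batch loss then averages over copies, lowering gradient variance
    # per unique image at the cost of a larger effective batch.
    copies = torch.cat(
        [torch.stack([augment(x) for _ in range(num_samples)]) for x in images]
    )
    return copies, labels.repeat_interleave(num_samples)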

Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes

1 code implementation • 22 Apr 2021 • James Lucas, Juhan Bae, Michael R. Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

Linear interpolation between initial neural network parameters and converged parameters after training with stochastic gradient descent (SGD) typically leads to a monotonic decrease in the training objective.
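This monotonic linear interpolation (MLI) property is easy to probe: evaluate the training loss along the straight line $\theta(\alpha) = (1-\alpha)\,\theta_0 + \alpha\,\theta_T$ between the initial and converged parameters. A hedged PyTorch sketch; the model, loader, and single-batch evaluation are placeholders, not the paper's protocol.

import copy
import torch

def loss_along_line(model, theta0, thetaT, loss_fn, loader, num_alphas=21):
    # theta0 / thetaT: lists of parameter tensors at init and at convergence.
    probe = copy.deepcopy(model)
    losses = []
    x, y = next(iter(loader))                  # single probe batch, for speed
    for alpha in torch.linspace(0, 1, num_alphas):
        with torch.no_grad():
            for p, p0, pT in zip(probe.parameters(), theta0, thetaT):
                p.copy_((1 - alpha) * p0 + alpha * pT)
            losses.append(loss_fn(probe(x), y).item())
    return losses                              # MLI: typically decreases monotonically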

Slice, Dice, and Optimize: Measuring the Dimension of Neural Network Class Manifolds

no code implementations • 1 Jan 2021 • Stanislav Fort, Ekin Dogus Cubuk, Surya Ganguli, Samuel Stern Schoenholz

Deep neural network classifiers naturally partition input space into regions belonging to different classes.

Training independent subnetworks for robust prediction

2 code implementations • ICLR 2021 • Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network.
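The approach in this paper (MIMO) realizes this by training $M$ independent subnetworks inside a single model: concatenate $M$ inputs channel-wise and predict $M$ outputs with one shared backbone. A minimal sketch; the head layout and channel-wise concatenation are assumptions consistent with the multi-input multi-output setup, not a verbatim reimplementation.

import torch
import torch.nn as nn

class MIMOClassifier(nn.Module):
    # One backbone whose first layer accepts m * C input channels,
    # plus m classification heads folded into a single linear layer.
    def __init__(self, backbone, feat_dim, num_classes, m=3):
        super().__init__()
        self.backbone, self.m = backbone, m
        self.head = nn.Linear(feat_dim, num_classes * m)

    def forward(self, xs):                        # xs: (batch, m, C, H, W)
        z = self.backbone(xs.flatten(1, 2))       # concatenate the m inputs
        return self.head(z).view(xs.shape[0], self.m, -1)

# Training: feed m independent examples and apply the loss head-by-head.
# Test: repeat one input m times and average the m softmax predictions.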

The Break-Even Point on Optimization Trajectories of Deep Neural Networks

no code implementations • ICLR 2020 • Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof Geras

We argue for the existence of the "break-even" point on this trajectory, beyond which the curvature of the loss surface and noise in the gradient are implicitly regularized by SGD.

Deep Ensembles: A Loss Landscape Perspective

1 code implementation • 5 Dec 2019 • Stanislav Fort, Huiyi Hu, Balaji Lakshminarayanan

One possible explanation for this gap between theory and practice is that popular scalable variational Bayesian methods tend to focus on a single mode, whereas deep ensembles tend to explore diverse modes in function space.

Emergent properties of the local geometry of neural loss landscapes

no code implementations • 14 Oct 2019 • Stanislav Fort, Surya Ganguli

The local geometry of high dimensional neural network loss landscapes can both challenge our cherished theoretical intuitions and dramatically impact the practical success of neural network training.

Large Scale Structure of Neural Network Loss Landscapes

1 code implementation • NeurIPS 2019 • Stanislav Fort, Stanislaw Jastrzebski

There are many surprising and perhaps counter-intuitive properties of optimization of deep neural networks.

Stiffness: A New Perspective on Generalization in Neural Networks

no code implementations • 28 Jan 2019 • Stanislav Fort, Paweł Krzysztof Nowak, Stanislaw Jastrzebski, Srini Narayanan

In particular, we study how stiffness depends on 1) class membership, 2) distance between data points in the input space, 3) training iteration, and 4) learning rate.
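Stiffness here quantifies whether a gradient step on one example also reduces the loss on another; a standard formalization is the sign or cosine of the inner product of the two per-example loss gradients. A hedged PyTorch sketch of the cosine variant (the exact definition used in the paper may differ in detail):

import torch

def stiffness(model, loss_fn, x1, y1, x2, y2):
    # Cosine similarity of per-example gradients; positive stiffness means
    # a step that helps (x1, y1) also helps (x2, y2).
    def grad(x, y):
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        return torch.cat([p.grad.flatten() for p in model.parameters()])
    g1, g2 = grad(x1, y1), grad(x2, y2)
    return torch.dot(g1, g2) / (g1.norm() * g2.norm() + 1e-12)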

Adaptive Quantum State Tomography with Neural Networks

no code implementations • 17 Dec 2018 • Yihui Quek, Stanislav Fort, Hui Khoon Ng

We demonstrate that our algorithm learns to work with basis, symmetric informationally complete (SIC), and other types of POVMs.

Quantum State Tomography

The Goldilocks zone: Towards better understanding of neural network loss landscapes

no code implementations • 6 Jul 2018 • Stanislav Fort, Adam Scherlis

We observe this effect for fully-connected neural networks over a range of network widths and depths on MNIST and CIFAR-10 datasets with the $\mathrm{ReLU}$ and $\tanh$ non-linearities, and a similar effect for convolutional networks.

Towards understanding feedback from supermassive black holes using convolutional neural networks

no code implementations • 2 Dec 2017 • Stanislav Fort

Supermassive black holes at the centers of clusters of galaxies strongly interact with their host environment via AGN feedback.

Gaussian Prototypical Networks for Few-Shot Learning on Omniglot

1 code implementation • ICLR 2018 • Stanislav Fort

We show that Gaussian prototypical networks are a preferred architecture over vanilla prototypical networks with an equivalent number of parameters.

Classification • Clustering • +2
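The difference from vanilla prototypical networks: the encoder also predicts a per-embedding confidence (an inverse variance), and prototypes become confidence-weighted means, with query distances weighted the same way. A minimal sketch under those assumptions; tensor shapes and the diagonal-covariance choice are illustrative, not the paper's exact parameterization.

import torch

def gaussian_prototypes(embeddings, inv_vars, labels, num_classes):
    # embeddings: (n, d); inv_vars: (n, d) predicted inverse variances.
    protos, proto_ivars = [], []
    for c in range(num_classes):
        m = labels == c
        s = inv_vars[m].sum(0)                             # pooled confidence
        protos.append((inv_vars[m] * embeddings[m]).sum(0) / s)
        proto_ivars.append(s)
    return torch.stack(protos), torch.stack(proto_ivars)

def gaussian_distance(query, protos, proto_ivars):
    # Variance-weighted squared distance to each prototype; lower = closer.
    return ((query - protos) ** 2 * proto_ivars).sum(-1)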
