Search Results for author: David J. Schwab

Found 24 papers, 10 papers with code

Noise driven phase transitions in eco-evolutionary systems

no code implementations 12 Oct 2023 Jim Wu, David J. Schwab, Trevor GrandPre

In complex ecosystems such as microbial communities, there is constant ecological and evolutionary feedback between the residing species and the environment occurring on concurrent timescales.

Generalized Information Bottleneck for Gaussian Variables

no code implementations 31 Mar 2023 Vudtiwat Ngampruetikorn, David J. Schwab

Here we consider a generalized IB problem, in which the mutual information in the original IB method is replaced by correlation measures based on Renyi and Jeffreys divergences.

Representation Learning
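
For orientation, the standard IB objective and one plausible Renyi-based generalization are sketched below; the specific correlation measures used in the paper may be defined differently.

$\mathcal{L}_{\mathrm{IB}} = I(X;T) - \beta\, I(T;Y)$, minimized over encoders $p(t \mid x)$,

with a Renyi-type replacement for the mutual information terms of the form

$I_\alpha(X;T) = D_\alpha\!\left(p(x,t) \,\|\, p(x)\,p(t)\right), \qquad D_\alpha(p\,\|\,q) = \frac{1}{\alpha-1}\log \int p^{\alpha} q^{1-\alpha}.$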

Information bottleneck theory of high-dimensional regression: relevancy, efficiency and optimality

no code implementations 8 Aug 2022 Vudtiwat Ngampruetikorn, David J. Schwab

Avoiding overfitting is a central challenge in machine learning, yet many large neural networks readily achieve zero training loss.

regression Vocal Bursts Intensity Prediction

An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers

no code implementations NeurIPS 2021 Ramakrishna Vedantam, David Lopez-Paz, David J. Schwab

Recent work demonstrates that deep neural networks trained using Empirical Risk Minimization (ERM) can generalize under distribution shift, outperforming specialized training algorithms for domain generalization.

Domain Generalization Out-of-Distribution Generalization

Learning Background Invariance Improves Generalization and Robustness in Self-Supervised Learning on ImageNet and Beyond

no code implementations NeurIPS Workshop ImageNet_PPF 2021 Chaitanya Ryali, David J. Schwab, Ari S. Morcos

Through a systematic, comprehensive investigation, we show that background augmentations lead to substantial improvements in generalization ($\sim$1-2% on ImageNet) across a spectrum of state-of-the-art self-supervised methods (MoCo-v2, BYOL, SwAV) on a variety of tasks, even enabling performance on par with the supervised baseline.

Data Augmentation Self-Supervised Learning +1
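
As a concrete illustration of a background augmentation (a hedged sketch, not the authors' implementation; the function name, mask source, and resizing assumptions are hypothetical), one can paste the object foreground onto a randomly drawn background:

    import numpy as np

    def background_swap(image, mask, backgrounds, rng=None):
        """image: HxWx3 float array; mask: HxW binary foreground mask;
        backgrounds: list of HxWx3 arrays already resized to the image shape."""
        rng = rng or np.random.default_rng()
        bg = backgrounds[rng.integers(len(backgrounds))]   # pick a random background
        mask3 = mask[..., None].astype(image.dtype)        # broadcast mask over channels
        return image * mask3 + bg * (1.0 - mask3)          # keep foreground, swap background

Related variants (e.g. zeroing or blurring the background) fit the same template by changing how bg is constructed.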

Perturbation Theory for the Information Bottleneck

no code implementations NeurIPS 2021 Vudtiwat Ngampruetikorn, David J. Schwab

Here we derive a perturbation theory for the IB method and report the first complete characterization of the learning onset, the limit of maximum relevant information per bit extracted from data.

Attribute

Implicit Regularization of SGD via Thermophoresis

no code implementations 1 Jan 2021 Mingwei Wei, David J. Schwab

The strength of this effect is proportional to the square of the learning rate and the inverse of the batch size, and it is strongest during the early phase of training, when the model's predictions are poor.
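
In symbols, the scaling stated in the abstract is

$\text{regularization strength} \;\propto\; \frac{\eta^{2}}{B},$

where $\eta$ is the learning rate and $B$ the batch size.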

Are all negatives created equal in contrastive instance discrimination?

no code implementations 13 Oct 2020 Tiffany Tianhui Cai, Jonathan Frankle, David J. Schwab, Ari S. Morcos

Using methodology from MoCo v2 (Chen et al., 2020), we divided negatives by their difficulty for a given query and studied which difficulty ranges were most important for learning useful representations.

Image Classification Self-Supervised Learning
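
A minimal sketch of the kind of analysis described (a hypothetical helper, not the released MoCo v2 code): rank negatives by their similarity to the query and keep only a chosen difficulty band before computing the contrastive loss.

    import torch

    def negatives_in_difficulty_band(query, negatives, low_pct=0.0, high_pct=0.95):
        """query: (D,) L2-normalized embedding; negatives: (N, D) L2-normalized embeddings."""
        sims = negatives @ query                 # cosine similarity as a difficulty proxy
        order = torch.argsort(sims)              # easiest (least similar) to hardest
        lo, hi = int(low_pct * len(order)), int(high_pct * len(order))
        return negatives[order[lo:hi]]           # negatives from the chosen percentile band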

Learning Optimal Representations with the Decodable Information Bottleneck

1 code implementation NeurIPS 2020 Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam

We address the question of characterizing and finding optimal representations for supervised learning.

Theory of gating in recurrent neural networks

no code implementations 29 Jul 2020 Kamesh Krishnamurthy, Tankut Can, David J. Schwab

The gate modulating the dimensionality can induce a novel, discontinuous chaotic transition, where inputs push a stable system to strong chaotic activity, in contrast to the typically stabilizing effect of inputs.

Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs

4 code implementations ICLR 2021 Jonathan Frankle, David J. Schwab, Ari S. Morcos

A wide variety of deep learning techniques from style transfer to multitask learning rely on training affine transformations of features.

Style Transfer
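
The training setup named in the title can be sketched as follows (a minimal PyTorch sketch assuming a standard torchvision ResNet; hyperparameters are illustrative): freeze every weight at its random initialization and train only BatchNorm's affine parameters.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    model = resnet18(num_classes=10)
    for p in model.parameters():
        p.requires_grad = False                  # freeze all randomly initialized features
    bn_params = []
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.weight.requires_grad = True        # gamma
            m.bias.requires_grad = True          # beta
            bn_params += [m.weight, m.bias]
    optimizer = torch.optim.SGD(bn_params, lr=0.1, momentum=0.9)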

The Early Phase of Neural Network Training

1 code implementation ICLR 2020 Jonathan Frankle, David J. Schwab, Ari S. Morcos

We perform extensive measurements of the network state during these early iterations of training and leverage the framework of Frankle et al. (2019) to quantitatively probe the weight distribution and its reliance on various aspects of the dataset.

Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs

no code implementations 31 Jan 2020 Tankut Can, Kamesh Krishnamurthy, David J. Schwab

Here, we take the perspective of studying randomly initialized LSTMs and GRUs as dynamical systems, and ask how the salient dynamical properties are shaped by the gates.
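
To make "randomly initialized GRUs as dynamical systems" concrete, here is an illustrative numpy sketch (not the paper's setup; the weight scale g and network size are arbitrary) that iterates the GRU state update with no external input and tracks the state norm:

    import numpy as np

    N, g = 200, 2.0                                       # network size and weight scale
    rng = np.random.default_rng(0)
    Wz, Wr, Wh = (g * rng.standard_normal((N, N)) / np.sqrt(N) for _ in range(3))

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    h = 0.1 * rng.standard_normal(N)                      # small random initial state
    norms = []
    for t in range(500):
        z = sigmoid(Wz @ h)                               # update gate
        r = sigmoid(Wr @ h)                               # reset gate
        h_cand = np.tanh(Wh @ (r * h))                    # candidate state
        h = (1 - z) * h + z * h_cand                      # GRU update, zero input
        norms.append(np.linalg.norm(h))
    print("state norm after 500 steps:", norms[-1])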

How noise affects the Hessian spectrum in overparameterized neural networks

no code implementations 1 Oct 2019 Mingwei Wei, David J. Schwab

Stochastic gradient descent (SGD) forms the core optimization method for deep neural networks.

Mean-field Analysis of Batch Normalization

no code implementations ICLR 2019 Mingwei Wei, James Stokes, David J. Schwab

Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization using higher learning rates and achieving faster convergence.
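
For reference, the per-feature BatchNorm transform (textbook definition, not notation from the paper) is

$\mathrm{BN}(x) = \gamma \,\frac{x - \mu_{\mathcal{B}}}{\sqrt{\sigma^{2}_{\mathcal{B}} + \epsilon}} + \beta,$

with batch statistics $\mu_{\mathcal{B}}, \sigma^{2}_{\mathcal{B}}$ and learned affine parameters $\gamma, \beta$.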

Energy consumption and cooperation for optimal sensing

1 code implementation 11 Sep 2018 Vudtiwat Ngampruetikorn, David J. Schwab, Greg J. Stephens

The reliable detection of environmental molecules in the presence of noise is an important cellular function, yet the underlying computational mechanisms are not well understood.

Biological Physics

A high-bias, low-variance introduction to Machine Learning for physicists

7 code implementations 23 Mar 2018 Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Clint Richardson, Charles K. Fisher, David J. Schwab

The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists.

BIG-bench Machine Learning Clustering +2

The information bottleneck and geometric clustering

1 code implementation 27 Dec 2017 DJ Strouse, David J. Schwab

The information bottleneck (IB) approach to clustering takes a joint distribution $P\!\left(X, Y\right)$ and maps the data $X$ to cluster labels $T$ which retain maximal information about $Y$ (Tishby et al., 1999).

Clustering Model Selection
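
The IB objective referred to (Tishby et al., 1999), minimized over stochastic encoders $q(t \mid x)$, is

$\min_{q(t \mid x)} \; I(X;T) - \beta\, I(T;Y),$

where $\beta$ trades off compressing $X$ against retaining information about $Y$; in the clustering reading, $T$ is the cluster label.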

Supervised Learning with Tensor Networks

1 code implementation NeurIPS 2016 Edwin Stoudenmire, David J. Schwab

Tensor networks are approximations of high-order tensors which are efficient to work with and have been very successful for physics and mathematics applications.

General Classification Tensor Networks
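
Schematically (a hedged sketch of the construction; notation is approximate and not taken from the paper), each input component is lifted by a local feature map and the classifier is linear in the tensor-product feature space:

$f^{\ell}(\mathbf{x}) = W^{\ell}\cdot\Phi(\mathbf{x}), \qquad \Phi(\mathbf{x}) = \phi(x_1)\otimes\phi(x_2)\otimes\cdots\otimes\phi(x_N),$

with the weight tensor $W^{\ell}$ represented approximately as a matrix product state so that training and evaluation remain tractable.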

Comment on "Why does deep and cheap learning work so well?" [arXiv:1608.08225]

no code implementations 12 Sep 2016 David J. Schwab, Pankaj Mehta

", Lin and Tegmark claim to show that the mapping between deep belief networks and the variational renormalization group derived in [arXiv:1410. 3831] is invalid, and present a "counterexample" that claims to show that this mapping does not hold.

Supervised Learning with Quantum-Inspired Tensor Networks

4 code implementations 18 May 2016 E. Miles Stoudenmire, David J. Schwab

Tensor networks are efficient representations of high-dimensional tensors which have been very successful for physics and mathematics applications.

General Classification Tensor Networks

The deterministic information bottleneck

2 code implementations 1 Apr 2016 DJ Strouse, David J. Schwab

Here, we introduce an alternative formulation that replaces mutual information with entropy, which we call the deterministic information bottleneck (DIB), and which we argue better captures this notion of compression.

Clustering Computational Efficiency
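
Reading the abstract literally, the change can be written as follows (a hedged sketch of the objectives):

standard IB: $\min_{q(t\mid x)} \; I(X;T) - \beta\, I(T;Y)$, deterministic IB: $\min_{q(t\mid x)} \; H(T) - \beta\, I(T;Y),$

so that compression is measured by the entropy of the representation rather than its mutual information with the input, which favors deterministic cluster assignments.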

An exact mapping between the Variational Renormalization Group and Deep Learning

4 code implementations 14 Oct 2014 Pankaj Mehta, David J. Schwab

Here, we show that deep learning is intimately related to one of the most important and successful techniques in theoretical physics, the renormalization group (RG).

Speech Recognition
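
A schematic of the type of correspondence involved (reconstructed, not quoted from the paper; details may differ): one variational RG step and an RBM both define an effective Hamiltonian for the coarse-grained (hidden) variables,

$e^{-H^{\mathrm{RG}}_{\lambda}[h]} = \mathrm{Tr}_{v}\, e^{T_{\lambda}(v,h) - H(v)}, \qquad e^{-H^{\mathrm{RBM}}[h]} = \mathrm{Tr}_{v}\, e^{-E(v,h)},$

and the two coincide when the variational operator is identified with the RBM energy, $T_{\lambda}(v,h) = -E(v,h) + H(v)$.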
