no code implementations • 12 Oct 2023 • Jim Wu, David J. Schwab, Trevor GrandPre
In complex ecosystems such as microbial communities, there is constant ecological and evolutionary feedback between the resident species and the environment, occurring on concurrent timescales.
no code implementations • 31 Mar 2023 • Vudtiwat Ngampruetikorn, David J. Schwab
Here we consider a generalized IB problem, in which the mutual information in the original IB method is replaced by correlation measures based on Rényi and Jeffreys divergences.
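As a schematic illustration of what such a replacement can look like (not necessarily the exact functional used in the paper), note that mutual information is the KL divergence between the joint distribution and the product of marginals, so one natural generalization swaps in a Rényi divergence of order $\alpha$:

$$I(X;T) = D_{\mathrm{KL}}\!\left(P_{XT}\,\|\,P_X P_T\right) \;\longrightarrow\; I_\alpha(X;T) = D_\alpha\!\left(P_{XT}\,\|\,P_X P_T\right), \qquad D_\alpha(p\,\|\,q) = \frac{1}{\alpha-1}\ln\sum_i p_i^{\alpha} q_i^{1-\alpha}.$$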
no code implementations • 8 Aug 2022 • Vudtiwat Ngampruetikorn, David J. Schwab
Avoiding overfitting is a central challenge in machine learning, yet many large neural networks readily achieve zero training loss.
no code implementations • NeurIPS 2021 • Ramakrishna Vedantam, David Lopez-Paz, David J. Schwab
Recent work demonstrates that deep neural networks trained using Empirical Risk Minimization (ERM) can generalize under distribution shift, outperforming specialized training algorithms for domain generalization.
no code implementations • NeurIPS Workshop ImageNet_PPF 2021 • Chaitanya Ryali, David J. Schwab, Ari S. Morcos
Through a systematic, comprehensive investigation, we show that background augmentations lead to substantial gains in generalization ($\sim$1-2% on ImageNet) across a spectrum of state-of-the-art self-supervised methods (MoCo-v2, BYOL, SwAV) and a variety of tasks, even enabling performance on par with the supervised baseline.
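As a rough illustration of what a background augmentation can look like, here is a minimal sketch that composites the masked foreground of one image onto a different background; the foreground mask is assumed to come from an off-the-shelf saliency or segmentation model, and the names below are illustrative rather than the paper's implementation.

```python
import numpy as np

def background_swap(image, fg_mask, background):
    """Composite the (approximate) foreground of `image` onto `background`.

    image, background: float arrays of shape (H, W, 3) in [0, 1]
    fg_mask: float array of shape (H, W), ~1 on the foreground, ~0 elsewhere;
             in practice this would come from a saliency/segmentation model.
    """
    mask = fg_mask[..., None]                    # broadcast over channels
    return mask * image + (1.0 - mask) * background

# Hypothetical usage inside a self-supervised pipeline: swap the background
# of one (or both) views before the usual MoCo-v2 / BYOL / SwAV augmentations.
rng = np.random.default_rng(0)
img, bg = rng.random((224, 224, 3)), rng.random((224, 224, 3))
mask = (rng.random((224, 224)) > 0.5).astype(np.float32)
augmented = background_swap(img, mask, bg)
```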
no code implementations • NeurIPS 2021 • Vudtiwat Ngampruetikorn, David J. Schwab
Here we derive a perturbation theory for the IB method and report the first complete characterization of the learning onset, the limit of maximum relevant information per bit extracted from data.
no code implementations • 23 Mar 2021 • Chaitanya K. Ryali, David J. Schwab, Ari S. Morcos
Recent progress in self-supervised learning has demonstrated promising results in multiple visual tasks.
Ranked #83 on Image Classification on ObjectNet (using extra training data)
no code implementations • 1 Jan 2021 • Mingwei Wei, David J. Schwab
The strength of this effect is proportional to the square of the learning rate and the inverse of the batch size, and is most effective during the early phase of training, when the model's predictions are poor.
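Written out, the stated scaling of the regularization strength with learning rate $\eta$ and batch size $B$ is

$$\text{strength} \;\propto\; \frac{\eta^{2}}{B}.$$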
no code implementations • 13 Oct 2020 • Tiffany Tianhui Cai, Jonathan Frankle, David J. Schwab, Ari S. Morcos
Using methodology from MoCo v2 (Chen et al., 2020), we divided negatives by their difficulty for a given query and studied which difficulty ranges were most important for learning useful representations.
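A minimal sketch of this kind of analysis, assuming L2-normalized embeddings and a MoCo-style queue of negatives (the function names and the percentile-band selection below are illustrative, not the paper's code):

```python
import torch

def rank_negatives_by_difficulty(query, negatives):
    """Rank a bank of negatives by difficulty for a given query.

    query:     (d,)  L2-normalized embedding of the query view
    negatives: (K, d) L2-normalized embeddings from the memory bank
    Difficulty is measured by cosine similarity to the query (higher = harder).
    Returns indices sorted from hardest to easiest, plus the similarities.
    """
    sims = negatives @ query                       # (K,) cosine similarities
    order = torch.argsort(sims, descending=True)
    return order, sims

def select_difficulty_band(order, lo_frac, hi_frac):
    """Keep only negatives in a chosen difficulty band, e.g.
    lo_frac=0.0, hi_frac=0.05 keeps the hardest 5% for a given query."""
    K = order.numel()
    return order[int(lo_frac * K):int(hi_frac * K)]
```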
1 code implementation • NeurIPS 2020 • Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam
We address the question of characterizing and finding optimal representations for supervised learning.
no code implementations • 29 Jul 2020 • Kamesh Krishnamurthy, Tankut Can, David J. Schwab
The gate modulating the dimensionality can induce a novel, discontinuous chaotic transition, where inputs push a stable system to strong chaotic activity, in contrast to the typically stabilizing effect of inputs.
4 code implementations • ICLR 2021 • Jonathan Frankle, David J. Schwab, Ari S. Morcos
A wide variety of deep learning techniques, from style transfer to multitask learning, rely on training affine transformations of features.
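One concrete instance of training only affine transformations of features is to freeze a network's weights and train just the BatchNorm scale and shift parameters; a minimal PyTorch sketch (illustrative, not the paper's training setup):

```python
import torch.nn as nn
from torchvision.models import resnet18

def freeze_all_but_affine(model):
    """Freeze every parameter except the affine parameters (gamma, beta)
    of the BatchNorm layers, so only those feature-wise affine
    transformations are trained."""
    for p in model.parameters():
        p.requires_grad = False
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            if m.weight is not None:
                m.weight.requires_grad = True   # gamma (scale)
            if m.bias is not None:
                m.bias.requires_grad = True     # beta (shift)
    return model

model = freeze_all_but_affine(resnet18())
trainable = [p for p in model.parameters() if p.requires_grad]
```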
1 code implementation • ICLR 2020 • Jonathan Frankle, David J. Schwab, Ari S. Morcos
We perform extensive measurements of the network state during these early iterations of training and leverage the framework of Frankle et al. (2019) to quantitatively probe the weight distribution and its reliance on various aspects of the dataset.
no code implementations • 31 Jan 2020 • Tankut Can, Kamesh Krishnamurthy, David J. Schwab
Here, we take the perspective of studying randomly initialized LSTMs and GRUs as dynamical systems, and ask how the salient dynamical properties are shaped by the gates.
no code implementations • 1 Oct 2019 • Mingwei Wei, David J. Schwab
Stochastic gradient descent (SGD) forms the core optimization method for deep neural networks.
no code implementations • ICLR 2019 • Mingwei Wei, James Stokes, David J. Schwab
Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization using higher learning rates and achieving faster convergence.
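For reference, BatchNorm normalizes each feature by its mini-batch statistics and then applies a learned affine map (Ioffe & Szegedy, 2015):

$$\hat{x}_i = \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma^{2}_{\mathcal{B}} + \epsilon}}, \qquad y_i = \gamma\,\hat{x}_i + \beta,$$

where $\mu_{\mathcal{B}}$ and $\sigma^{2}_{\mathcal{B}}$ are the per-feature mini-batch mean and variance and $\gamma, \beta$ are learned parameters.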
1 code implementation • 11 Sep 2018 • Vudtiwat Ngampruetikorn, David J. Schwab, Greg J. Stephens
The reliable detection of environmental molecules in the presence of noise is an important cellular function, yet the underlying computational mechanisms are not well understood.
7 code implementations • 23 Mar 2018 • Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Clint Richardson, Charles K. Fisher, David J. Schwab
The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists.
1 code implementation • 27 Dec 2017 • DJ Strouse, David J. Schwab
The information bottleneck (IB) approach to clustering takes a joint distribution $P\!\left(X, Y\right)$ and maps the data $X$ to cluster labels $T$ which retain maximal information about $Y$ (Tishby et al., 1999).
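In the standard formulation, this trade-off is posed as a variational problem over the encoder $q(t \mid x)$,

$$\min_{q(t \mid x)} \; \mathcal{L}_{\mathrm{IB}} = I(X;T) - \beta\, I(T;Y),$$

where the multiplier $\beta$ controls how much relevant information about $Y$ is retained per bit of compression of $X$.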
1 code implementation • NeurIPS 2016 • Edwin Stoudenmire, David J. Schwab
Tensor networks are approximations of high-order tensors which are efficient to work with and have been very successful for physics and mathematics applications.
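As a rough illustration of how such a model is evaluated, here is a minimal numpy sketch that contracts a matrix product state (MPS) weight tensor against a product-state feature map, using the local map $\phi(x) = (\cos\frac{\pi x}{2}, \sin\frac{\pi x}{2})$ from the paper; this is only the naive forward contraction, not the sweeping (DMRG-style) optimization used for training, and all names below are illustrative.

```python
import numpy as np

def local_feature(x):
    """Local feature map from the paper: a value x in [0, 1]
    is mapped to the 2-vector (cos(pi*x/2), sin(pi*x/2))."""
    return np.array([np.cos(np.pi * x / 2.0), np.sin(np.pi * x / 2.0)])

def mps_scores(pixels, tensors, label_site):
    """Contract an MPS 'weight tensor' with the product-state feature map.

    tensors[j] has shape (D_left, 2, D_right); the tensor at `label_site`
    carries an extra label index and has shape (D_left, 2, n_labels, D_right).
    Boundary bond dimensions are 1.  Returns an (n_labels,) score vector.
    """
    env = np.ones(1)                # left boundary environment
    label_env = None
    for j, (x, A) in enumerate(zip(pixels, tensors)):
        phi = local_feature(x)
        if j == label_site:
            M = np.einsum('s,lsod->lod', phi, A)        # (D_l, n_labels, D_r)
            label_env = np.einsum('l,lod->od', env, M)  # (n_labels, D_r)
        elif label_env is None:
            env = env @ np.einsum('s,lsd->ld', phi, A)  # (D_r,)
        else:
            M = np.einsum('s,lsd->ld', phi, A)
            label_env = np.einsum('od,dr->or', label_env, M)
    return label_env[:, 0]          # right boundary dimension is 1

# Toy usage: 4 inputs, bond dimension 3, 10 labels, label index at site 2.
rng = np.random.default_rng(0)
D, L = 3, 4
tensors = [rng.standard_normal((1 if j == 0 else D, 2, 1 if j == L - 1 else D))
           for j in range(L)]
tensors[2] = rng.standard_normal((D, 2, 10, D))
scores = mps_scores(rng.random(L), tensors, label_site=2)
```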
no code implementations • 12 Sep 2016 • David J. Schwab, Pankaj Mehta
", Lin and Tegmark claim to show that the mapping between deep belief networks and the variational renormalization group derived in [arXiv:1410. 3831] is invalid, and present a "counterexample" that claims to show that this mapping does not hold.
4 code implementations • 18 May 2016 • E. Miles Stoudenmire, David J. Schwab
Tensor networks are efficient representations of high-dimensional tensors which have been very successful for physics and mathematics applications.
2 code implementations • 1 Apr 2016 • DJ Strouse, David J. Schwab
Here, we introduce an alternative formulation, the deterministic information bottleneck (DIB), which replaces mutual information with entropy and, we argue, better captures this notion of compression.
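Schematically, where the original IB compresses via the term $I(X;T)$, the DIB instead penalizes the entropy of the representation,

$$\min_{q(t \mid x)} \; \mathcal{L}_{\mathrm{DIB}} = H(T) - \beta\, I(T;Y),$$

whose optimal encoders turn out to be deterministic mappings from $X$ to $T$.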
4 code implementations • 14 Oct 2014 • Pankaj Mehta, David J. Schwab
Here, we show that deep learning is intimately related to one of the most important and successful techniques in theoretical physics, the renormalization group (RG).
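Very schematically (as a reminder of the two objects being related, not the paper's full construction), a variational RG step and a restricted Boltzmann machine both produce an effective description of "hidden" degrees of freedom by tracing out the "visible" ones:

$$e^{-H'[\{h\}]} = \mathrm{Tr}_{\{v\}}\, e^{\,T(\{v\},\{h\}) - H[\{v\}]}, \qquad p(\{h\}) = \frac{1}{Z}\sum_{\{v\}} e^{-E(\{v\},\{h\})},$$

where $T$ parametrizes the coarse-graining and $E$ is the RBM energy; the mapping in the paper identifies a choice of $T$ under which the two coincide.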