no code implementations • 3 Oct 2023 • Anas Mahmoud, Mostafa Elhoushi, Amro Abbas, Yu Yang, Newsha Ardalani, Hugh Leather, Ari Morcos
Vision-Language Models (VLMs) are pretrained on large, diverse, and noisy web-crawled datasets.
1 code implementation • NeurIPS 2023 • Mitchell Wortsman, Tim Dettmers, Luke Zettlemoyer, Ari Morcos, Ali Farhadi, Ludwig Schmidt
We introduce new methods for 1) accelerating and 2) stabilizing training for large language-vision models.
no code implementations • 25 Apr 2023 • Shashank Shekhar, Florian Bordes, Pascal Vincent, Ari Morcos
Here, we aim to explain these differences by analyzing the impact of these objectives on the structure and transferability of the learned representations.
1 code implementation • 24 Apr 2023 • Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann Lecun, Micah Goldblum
Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning.
no code implementations • 24 Oct 2022 • Mark Ibrahim, Diane Bouchacourt, Ari Morcos
Our approach applies the formalism of Lie groups to capture continuous transformations, improving models' robustness to distributional shifts.
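As a rough illustration of the Lie-group formalism (a minimal sketch, not the paper's method), a continuous transformation can be parameterized by exponentiating a generator of the group; here, planar rotation:

```python
import torch

# Generator of SO(2), the group of 2-D rotations: exp(theta * G) rotates by theta.
G = torch.tensor([[0.0, -1.0],
                  [1.0,  0.0]])

def rotate(points: torch.Tensor, theta: float) -> torch.Tensor:
    # Exponential map from the Lie algebra (the generator) to the group element.
    R = torch.matrix_exp(theta * G)
    return points @ R.T

# theta varies continuously, so this defines a one-parameter family of
# transformations rather than a fixed, discrete set of augmentations.
points = torch.randn(8, 2)
augmented = rotate(points, theta=0.3)
```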
no code implementations • 24 Oct 2022 • Mark Ibrahim, Quentin Garrido, Ari Morcos, Diane Bouchacourt
We study not only how robust recent state-of-the-art models are, but also the extent to which models can generalize to variation in these factors when it is present during training.
no code implementations • 15 Oct 2021 • Jose Javier Gonzalez Ortiz, Jonathan Frankle, Mike Rabbat, Ari Morcos, Nicolas Ballas
As datasets and models become increasingly large, distributed training has become a necessary component to allow deep neural networks to train in reasonable amounts of time.
no code implementations • 10 Jun 2021 • Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Ari Morcos
Finally, we experiment with initializing the T-CNN from a partially trained CNN, and find that it reaches better performance than the corresponding hybrid model trained from scratch, while reducing training time.
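A minimal sketch of this kind of warm start (stand-in modules; the actual T-CNN architecture differs): parameters whose names and shapes match are copied from the partially trained CNN, while the remaining layers keep their fresh initialization.

```python
import torch
import torch.nn as nn

# Stand-in models: the hybrid `t_cnn` shares its early stages with `cnn`
# and appends extra layers (the real T-CNN swaps stages for attention).
cnn = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
t_cnn = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3),
                      nn.ReLU(), nn.Conv2d(32, 64, 3))

# ... partially train `cnn`, then transfer every matching parameter:
src, dst = cnn.state_dict(), t_cnn.state_dict()
dst.update({k: v for k, v in src.items()
            if k in dst and v.shape == dst[k].shape})
t_cnn.load_state_dict(dst)
# Layers unique to `t_cnn` retain their initialization.
```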
9 code implementations • 19 Mar 2021 • Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun
We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information.
Ranked #463 on Image Classification on ImageNet
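A minimal single-head sketch of the gating mechanism described above (the paper's GPSA uses relative positional encodings, per-head gates, and a locality-based initialization of the positional map; the simplifications here are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedPositionalSelfAttention(nn.Module):
    """Single-head sketch: a learned gate blends content-based attention
    with a positional attention map (simplified relative to the paper)."""
    def __init__(self, dim: int, num_tokens: int):
        super().__init__()
        self.qk = nn.Linear(dim, 2 * dim, bias=False)
        self.v = nn.Linear(dim, dim, bias=False)
        # Positional attention logits. The paper initializes these to a
        # local, convolution-like pattern; zeros here for brevity.
        self.pos_logits = nn.Parameter(torch.zeros(num_tokens, num_tokens))
        # Gating parameter: sigmoid(gate) near 1 favors positional (local)
        # attention; training can drive it toward content-based attention.
        self.gate = nn.Parameter(torch.ones(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, N, D)
        q, k = self.qk(x).chunk(2, dim=-1)
        content = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        positional = F.softmax(self.pos_logits, dim=-1)
        g = torch.sigmoid(self.gate)
        attn = (1.0 - g) * content + g * positional       # gated blend
        return attn @ self.v(x)

out = GatedPositionalSelfAttention(dim=64, num_tokens=49)(torch.randn(2, 49, 64))
```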
no code implementations • 22 Oct 2020 • Matthew L. Leavitt, Ari Morcos
Methods for understanding the decisions of and mechanisms underlying deep neural networks (DNNs) typically rely on building intuition by emphasizing sensory or semantic features of individual examples.
no code implementations • 14 Oct 2020 • Matthew L. Leavitt, Ari Morcos
Furthermore, the input-unit gradient is more variable across samples and units in high-selectivity networks compared to low-selectivity networks.
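A rough illustration of the quantity involved (a toy model, not the paper's setup): the gradient of one hidden unit's activation with respect to the input, computed per sample, whose spread across the batch indicates variability.

```python
import torch
import torch.nn as nn

# Toy model; the paper's experiments use trained CNNs.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 5))

def input_unit_gradients(x: torch.Tensor, unit: int) -> torch.Tensor:
    """Per-sample gradient of one hidden unit's activation w.r.t. the input."""
    x = x.clone().requires_grad_(True)
    hidden = model[1](model[0](x))                 # first-layer activations
    grads, = torch.autograd.grad(hidden[:, unit].sum(), x)
    return grads                                   # shape: (batch, input_dim)

x = torch.randn(64, 10)
g = input_unit_gradients(x, unit=3)
variability = g.std(dim=0).mean()                  # spread across samples
```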
1 code implementation • 6 Oct 2020 • Ramakrishna Vedantam, Arthur Szlam, Maximilian Nickel, Ari Morcos, Brenden Lake
Humans can learn and reason under substantial uncertainty in a space of infinitely many concepts, including structured relational concepts ("a scene with objects that have the same color") and ad-hoc categories defined through goals ("objects that could fall on one's head").
no code implementations • 12 Mar 2020 • Erik Wijmans, Julian Straub, Dhruv Batra, Irfan Essa, Judy Hoffman, Ari Morcos
Recent advances in deep reinforcement learning require large amounts of training data and generally result in representations that are over-specialized to the target task.
no code implementations • ICLR 2021 • Matthew L. Leavitt, Ari Morcos
For ResNet20 trained on CIFAR10, we could reduce class selectivity by a factor of 2.5 with no impact on test accuracy, and reduce it nearly to zero with only a small ($\sim$2%) drop in test accuracy.
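A minimal sketch of a class-selectivity index and its use as a regularizer (the exact formulation and the sign/scale of the hypothetical coefficient `alpha` are assumptions here):

```python
import torch
import torch.nn.functional as F

def class_selectivity_index(acts: torch.Tensor, labels: torch.Tensor,
                            num_classes: int, eps: float = 1e-7) -> torch.Tensor:
    """Mean selectivity across units. acts: (B, U) non-negative (post-ReLU)
    activations; assumes every class appears in the batch."""
    class_means = torch.stack([acts[labels == c].mean(dim=0)
                               for c in range(num_classes)])       # (C, U)
    u_max, _ = class_means.max(dim=0)                              # most-active class
    u_rest = (class_means.sum(dim=0) - u_max) / (num_classes - 1)  # remaining classes
    si = (u_max - u_rest) / (u_max + u_rest + eps)                 # in [0, 1]
    return si.mean()

# Hypothetical training objective: alpha > 0 pushes selectivity down.
# loss = F.cross_entropy(logits, labels) \
#        + alpha * class_selectivity_index(acts, labels, num_classes)
```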
no code implementations • 26 Feb 2020 • Or Litany, Ari Morcos, Srinath Sridhar, Leonidas Guibas, Judy Hoffman
We seek to learn a representation on a large annotated data source that generalizes to a target domain using limited new supervision.
no code implementations • 10 Jan 2020 • Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin
In this work, we investigate the use of standard pruning methods, developed primarily for supervised learning, for networks trained without labels (i.e., on self-supervised tasks).
no code implementations • ICLR 2020 • Erik Wijmans, Julian Straub, Irfan Essa, Dhruv Batra, Judy Hoffman, Ari Morcos
Surprisingly, we find that slight differences in task have no measurable effect on the visual representation for both SqueezeNet and ResNet architectures.
8 code implementations • ICLR 2020 • Erik Wijmans, Abhishek Kadian, Ari Morcos, Stefan Lee, Irfan Essa, Devi Parikh, Manolis Savva, Dhruv Batra
We leverage this scaling to train an agent for 2.5 billion steps of experience (the equivalent of 80 years of human experience), completing over 6 months of GPU-time of training in under 3 days of wall-clock time with 64 GPUs.
Ranked #1 on PointGoal Navigation on Gibson PointGoal Navigation
no code implementations • 25 Sep 2019 • Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin
The lottery ticket hypothesis argues that neural networks contain sparse subnetworks, which, if appropriately initialized (the winning tickets), are capable of matching the accuracy of the full network when trained in isolation.
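For reference, a minimal sketch of the standard procedure the hypothesis is tested with (magnitude pruning with rewinding to initialization; the paper applies it in the self-supervised setting):

```python
import copy
import torch
import torch.nn as nn

def prune_and_rewind(model: nn.Module, init_state: dict,
                     prune_frac: float = 0.2) -> None:
    """One round of the lottery-ticket procedure: mask the smallest-magnitude
    weights, then rewind the survivors to their initial values."""
    for name, param in model.named_parameters():
        if param.dim() < 2:                       # skip biases in this sketch
            continue
        k = max(1, int(prune_frac * param.numel()))
        threshold = param.abs().flatten().kthvalue(k).values
        mask = (param.abs() > threshold).float()
        with torch.no_grad():
            param.copy_(init_state[name] * mask)  # sparse subnet at its init

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
init_state = copy.deepcopy(model.state_dict())
# ... train the dense model (supervised or self-supervised), then:
prune_and_rewind(model, init_state)
# ... retrain the sparse subnetwork; iterate to reach higher sparsity.
```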
1 code implementation • 31 May 2019 • Yuandong Tian, Tina Jiang, Qucheng Gong, Ari Morcos
We analyze the dynamics of training deep ReLU networks and their implications for generalization.
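As a minimal illustration of the object under study (a generic teacher-student gradient flow for a deep ReLU network; the paper's specific assumptions and results go further):

```latex
% Gradient-flow dynamics of a deep ReLU network f_W trained to match
% a teacher f_{W^*} (a generic formulation, not the paper's theorem):
\[
  \dot{W} \;=\; -\,\nabla_W \,\mathbb{E}_{x}\!\left[\tfrac{1}{2}\,
      \big\| f_W(x) - f_{W^*}(x) \big\|^2 \right],
  \qquad
  f_W(x) \;=\; W_L\,\sigma\!\big(W_{L-1}\cdots\,\sigma(W_1 x)\big),
  \quad \sigma(z)=\max(z,0).
\]
```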