Search Results for author: Ari Morcos

Found 20 papers, 6 papers with code

Objectives Matter: Understanding the Impact of Self-Supervised Objectives on Vision Transformer Representations

no code implementations 25 Apr 2023 Shashank Shekhar, Florian Bordes, Pascal Vincent, Ari Morcos

Here, we aim to explain these differences by analyzing the impact of these objectives on the structure and transferability of the learned representations.

Self-Supervised Learning Specificity

Robust Self-Supervised Learning with Lie Groups

no code implementations 24 Oct 2022 Mark Ibrahim, Diane Bouchacourt, Ari Morcos

Our approach applies the formalism of Lie groups to capture continuous transformations to improve models' robustness to distributional shifts.

Self-Supervised Learning

The Robustness Limits of SoTA Vision Models to Natural Variation

no code implementations 24 Oct 2022 Mark Ibrahim, Quentin Garrido, Ari Morcos, Diane Bouchacourt

We study not only how robust recent state-of-the-art models are, but also the extent to which models can generalize variation in factors when they're present during training.

Trade-offs of Local SGD at Scale: An Empirical Study

no code implementations 15 Oct 2021 Jose Javier Gonzalez Ortiz, Jonathan Frankle, Mike Rabbat, Ari Morcos, Nicolas Ballas

As datasets and models become increasingly large, distributed training has become a necessary component to allow deep neural networks to train in reasonable amounts of time.

Image Classification

Transformed CNNs: recasting pre-trained convolutional layers with self-attention

no code implementations 10 Jun 2021 Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Ari Morcos

Finally, we experiment initializing the T-CNN from a partially trained CNN, and find that it reaches better performance than the corresponding hybrid model trained from scratch, while reducing training time.

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

9 code implementations 19 Mar 2021 Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun

We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information.

Image Classification Inductive Bias

Towards falsifiable interpretability research

no code implementations 22 Oct 2020 Matthew L. Leavitt, Ari Morcos

Methods for understanding the decisions of and mechanisms underlying deep neural networks (DNNs) typically rely on building intuition by emphasizing sensory or semantic features of individual examples.

Linking average- and worst-case perturbation robustness via class selectivity and dimensionality

no code implementations 14 Oct 2020 Matthew L. Leavitt, Ari Morcos

Furthermore, the input-unit gradient is more variable across samples and units in high-selectivity networks compared to low-selectivity networks.

CURI: A Benchmark for Productive Concept Learning Under Uncertainty

1 code implementation 6 Oct 2020 Ramakrishna Vedantam, Arthur Szlam, Maximilian Nickel, Ari Morcos, Brenden Lake

Humans can learn and reason under substantial uncertainty in a space of infinitely many concepts, including structured relational concepts ("a scene with objects that have the same color") and ad-hoc categories defined through goals ("objects that could fall on one's head").

Meta-Learning Systematic Generalization

Analyzing Visual Representations in Embodied Navigation Tasks

no code implementations 12 Mar 2020 Erik Wijmans, Julian Straub, Dhruv Batra, Irfan Essa, Judy Hoffman, Ari Morcos

Recent advances in deep reinforcement learning require a large amount of training data and generally result in representations that are often over specialized to the target task.

Reinforcement Learning (RL)

Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs

no code implementations ICLR 2021 Matthew L. Leavitt, Ari Morcos

For ResNet20 trained on CIFAR10 we could reduce class selectivity by a factor of 2.5 with no impact on test accuracy, and reduce it nearly to zero with only a small ($\sim$2%) drop in test accuracy.

Open-Ended Question Answering Test

Representation Learning Through Latent Canonicalizations

no code implementations 26 Feb 2020 Or Litany, Ari Morcos, Srinath Sridhar, Leonidas Guibas, Judy Hoffman

We seek to learn a representation on a large annotated data source that generalizes to a target domain using limited new supervision.


Pruning Convolutional Neural Networks with Self-Supervision

no code implementations 10 Jan 2020 Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin

In this work, we investigate the use of standard pruning methods, developed primarily for supervised learning, for networks trained without labels (i.e., on self-supervised tasks).

Insights on Visual Representations for Embodied Navigation Tasks

no code implementations ICLR 2020 Erik Wijmans, Julian Straub, Irfan Essa, Dhruv Batra, Judy Hoffman, Ari Morcos

Surprisingly, we find that slight differences in task have no measurable effect on the visual representation for both SqueezeNet and ResNet architectures.

DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames

8 code implementations ICLR 2020 Erik Wijmans, Abhishek Kadian, Ari Morcos, Stefan Lee, Irfan Essa, Devi Parikh, Manolis Savva, Dhruv Batra

We leverage this scaling to train an agent for 2.5 Billion steps of experience (the equivalent of 80 years of human experience) -- over 6 months of GPU-time training in under 3 days of wall-clock time with 64 GPUs.

Autonomous Navigation Navigate

Finding Winning Tickets with Limited (or No) Supervision

no code implementations 25 Sep 2019 Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin

The lottery ticket hypothesis argues that neural networks contain sparse subnetworks, which, if appropriately initialized (the winning tickets), are capable of matching the accuracy of the full network when trained in isolation.

Luck Matters: Understanding Training Dynamics of Deep ReLU Networks

1 code implementation 31 May 2019 Yuandong Tian, Tina Jiang, Qucheng Gong, Ari Morcos

We analyze the dynamics of training deep ReLU networks and their implications on generalization capability.
