Search Results for author: Jean-Baptiste Cordonnier

Found 8 papers, 7 papers with code

Differentiable Patch Selection for Image Recognition

no code implementations CVPR 2021 Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner

Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand.

Traffic Sign Recognition

Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth

1 code implementation 5 Mar 2021 Yihe Dong, Jean-Baptiste Cordonnier, Andreas Loukas

Attention-based architectures have become ubiquitous in machine learning, yet our understanding of the reasons for their effectiveness remains limited.

Inductive Bias

Group Equivariant Stand-Alone Self-Attention For Vision

1 code implementation ICLR 2021 David W. Romero, Jean-Baptiste Cordonnier

We provide a general self-attention formulation to impose group equivariance to arbitrary symmetry groups.

Multi-Head Attention: Collaborate Instead of Concatenate

2 code implementations 29 Jun 2020 Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi

We also show that it is possible to re-parametrize a pre-trained multi-head attention layer into our collaborative attention layer.

Machine Translation, Translation

Robust Cross-lingual Embeddings from Parallel Sentences

2 code implementations 28 Dec 2019 Ali Sabet, Prakhar Gupta, Jean-Baptiste Cordonnier, Robert West, Martin Jaggi

Recent advances in cross-lingual word embeddings have primarily relied on mapping-based methods, which project pretrained word embeddings from different languages into a shared space through a linear transformation.

Cross-Lingual Document Classification, Cross-Lingual Word Embeddings, +5 more

On the Relationship between Self-Attention and Convolutional Layers

1 code implementation ICLR 2020 Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi

This work provides evidence that attention layers can perform convolution and, indeed, they often learn to do so in practice.

Image Classification

Extrapolating paths with graph neural networks

1 code implementation 18 Mar 2019 Jean-Baptiste Cordonnier, Andreas Loukas

We consider the problem of path inference: given a path prefix, i.e., a partially observed sequence of nodes in a graph, we want to predict which nodes are in the missing suffix.

Sparsified SGD with Memory

1 code implementation NeurIPS 2018 Sebastian U. Stich, Jean-Baptiste Cordonnier, Martin Jaggi

Huge scale machine learning problems are nowadays tackled by distributed optimization algorithms, i.e., algorithms that leverage the compute power of many devices for training.

Distributed Optimization, Quantization
