Search Results for author: Armand Joulin

Found 70 papers, 50 papers with code

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

1 code implementation 16 Feb 2022 Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski

Discriminative self-supervised learning allows training models on any random group of internet images, and possibly recover salient information that helps differentiate between the images.

 Ranked #1 on Copy Detection on Copydays strong subset (using extra training data)

Action Classification Action Recognition +10

Omnivore: A Single Model for Many Visual Modalities

1 code implementation 20 Jan 2022 Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra

Prior work has studied different visual modalities in isolation and developed separate architectures for recognition of images, videos, and 3D data.

 Ranked #1 on Scene Recognition on SUN-RGBD (using extra training data)

Action Classification Action Recognition +3

Detecting Twenty-thousand Classes using Image-level Supervision

1 code implementation 7 Jan 2022 Xingyi Zhou, Rohit Girdhar, Armand Joulin, Phillip Krähenbühl, Ishan Misra

For the first time, we train a detector with all the twenty-one-thousand classes of the ImageNet dataset and show that it generalizes to new datasets without fine-tuning.

Image Classification

Learning Co-segmentation by Segment Swapping for Retrieval and Discovery

1 code implementation 29 Oct 2021 Xi Shen, Alexei A. Efros, Armand Joulin, Mathieu Aubry

The goal of this work is to efficiently identify visually similar patterns in images, e.g., identifying an artwork detail copied between an engraving and an oil painting, or recognizing parts of a night-time photograph visible in its daytime counterpart.

Graph Clustering Object Discovery +2

Contrastive Pre-training for Zero-Shot Information Retrieval

no code implementations 29 Sep 2021 Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Edouard Grave

By contrast, in many other NLP tasks, conventional self-supervised pre-training based on masking leads to strong generalization with a small number of training examples.

Contrastive Learning Fact Checking +2

XCiT: Cross-Covariance Image Transformers

10 code implementations NeurIPS 2021 Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.

Instance Segmentation Object Detection +2

Emerging Properties in Self-Supervised Vision Transformers

16 code implementations ICCV 2021 Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin

In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).

Copy Detection Self-Supervised Image Classification +5

Self-Supervised Pretraining of 3D Features on any Point-Cloud

1 code implementation ICCV 2021 Zaiwei Zhang, Rohit Girdhar, Armand Joulin, Ishan Misra

Pretraining on large labeled datasets is a prerequisite to achieve good performance in many computer vision tasks like 2D object recognition, video classification etc.

Object Detection Object Recognition +2

Beyond English-Centric Multilingual Machine Translation

4 code implementations 21 Oct 2020 Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages.

Machine Translation Translation

Target Conditioning for One-to-Many Generation

no code implementations Findings of the Association for Computational Linguistics 2020 Marie-Anne Lachaux, Armand Joulin, Guillaume Lample

In this paper, we propose to explicitly model this one-to-many mapping by conditioning the decoder of a NMT model on a latent variable that represents the domain of target sentences.

Machine Translation Translation

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

12 code implementations NeurIPS 2020 Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, Armand Joulin

In addition, we also propose a new data augmentation strategy, multi-crop, that uses a mix of views with different resolutions in place of two full-resolution views, without increasing the memory or compute requirements much.

Contrastive Learning Data Augmentation +2

Training with Quantization Noise for Extreme Model Compression

3 code implementations ICLR 2021 Angela Fan, Pierre Stock, Benjamin Graham, Edouard Grave, Remi Gribonval, Herve Jegou, Armand Joulin

A standard solution is to train networks with Quantization Aware Training, where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator.

Image Generation Model Compression

Learning to Visually Navigate in Photorealistic Environments Without any Supervision

no code implementations 10 Apr 2020 Lina Mezghani, Sainbayar Sukhbaatar, Arthur Szlam, Armand Joulin, Piotr Bojanowski

Learning to navigate in a realistic setting where an agent must rely solely on visual inputs is a challenging task, in part because the lack of position information makes it difficult to provide supervision during training.

Unsupervised pretraining transfers well across languages

2 code implementations 7 Feb 2020 Morgane Rivière, Armand Joulin, Pierre-Emmanuel Mazaré, Emmanuel Dupoux

Cross-lingual and multi-lingual training of Automatic Speech Recognition (ASR) has been extensively investigated in the supervised setting.

Automatic Speech Recognition

Pruning Convolutional Neural Networks with Self-Supervision

no code implementations 10 Jan 2020 Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin

In this work, we investigate the use of standard pruning methods, developed primarily for supervised learning, for networks trained without labels (i.e., on self-supervised tasks).

Libri-Light: A Benchmark for ASR with Limited or No Supervision

1 code implementation 17 Dec 2019 Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux

Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).

 Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-across metric)

Speech Recognition

CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB

3 code implementations ACL 2021 Holger Schwenk, Guillaume Wenzek, Sergey Edunov, Edouard Grave, Armand Joulin

To evaluate the quality of the mined bitexts, we train NMT systems for most of the language pairs and evaluate them on TED, WMT and WAT test sets.


Finding Winning Tickets with Limited (or No) Supervision

no code implementations 25 Sep 2019 Mathilde Caron, Ari Morcos, Piotr Bojanowski, Julien Mairal, Armand Joulin

The lottery ticket hypothesis argues that neural networks contain sparse subnetworks, which, if appropriately initialized (the winning tickets), are capable of matching the accuracy of the full network when trained in isolation.

Reducing Transformer Depth on Demand with Structured Dropout

4 code implementations ICLR 2020 Angela Fan, Edouard Grave, Armand Joulin

Overparameterized transformer networks have obtained state of the art results in various natural language processing tasks, such as machine translation, language modeling, and question answering.

Language Modelling Machine Translation +2

Why Build an Assistant in Minecraft?

1 code implementation 22 Jul 2019 Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston

In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.

Natural Language Understanding

Augmenting Self-attention with Persistent Memory

6 code implementations 2 Jul 2019 Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin

More precisely, we augment the self-attention layers with persistent memory vectors that play a similar role as the feed-forward layer.

Language Modelling Translation

Unsupervised Pre-Training of Image Features on Non-Curated Data

2 code implementations ICCV 2019 Mathilde Caron, Piotr Bojanowski, Julien Mairal, Armand Joulin

Our goal is to bridge the performance gap between unsupervised methods trained on curated data, which are costly to obtain, and massive raw datasets that are easily available.

Self-Supervised Image Classification Unsupervised Pre-training

Cooperative Learning of Disjoint Syntax and Semantics

1 code implementation NAACL 2019 Serhii Havrylov, Germán Kruszewski, Armand Joulin

There has been considerable attention devoted to models that learn to jointly infer an expression's syntactic structure and its semantics.

Domain Generalization Natural Language Inference +1

Deep Clustering for Unsupervised Learning of Visual Features

9 code implementations ECCV 2018 Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze

In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features.

Deep Clustering Image Clustering +1

Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion

4 code implementations EMNLP 2018 Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Herve Jegou, Edouard Grave

Continuous word representations learned separately on distinct languages can be aligned so that their words become comparable in a common space.

Translation Word Translation

Learning Word Vectors for 157 Languages

2 code implementations LREC 2018 Edouard Grave, Piotr Bojanowski, Prakhar Gupta, Armand Joulin, Tomas Mikolov

Distributed word representations, or word vectors, have recently been applied to many tasks in natural language processing, leading to state-of-the-art performance.

Advances in Pre-Training Distributed Word Representations

5 code implementations LREC 2018 Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin

Many Natural Language Processing applications nowadays rely on pre-trained word representations estimated from large text corpora such as news collections, Wikipedia and Web Crawl.

Unbounded cache model for online language modeling with open vocabulary

2 code implementations NeurIPS 2017 Edouard Grave, Moustapha Cisse, Armand Joulin

Recently, continuous cache models were proposed as extensions to recurrent neural network language models, to adapt their predictions to local changes in the data distribution.

Language Modelling Quantization

Fast Linear Model for Knowledge Graph Embeddings

1 code implementation 30 Oct 2017 Armand Joulin, Edouard Grave, Piotr Bojanowski, Maximilian Nickel, Tomas Mikolov

This paper shows that a simple baseline based on a Bag-of-Words (BoW) representation learns surprisingly good knowledge graph embeddings.

General Classification Knowledge Base Completion +2

Optimizing the Latent Space of Generative Networks

5 code implementations ICML 2018 Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam

Generative Adversarial Networks (GANs) have achieved remarkable results in the task of generating realistic natural images.

Unsupervised Learning by Predicting Noise

1 code implementation ICML 2017 Piotr Bojanowski, Armand Joulin

We propose to fix a set of target representations, called Noise As Targets (NAT), and to constrain the deep features to align to them.

CommAI: Evaluating the first steps towards a useful general AI

no code implementations 31 Jan 2017 Marco Baroni, Armand Joulin, Allan Jabri, Germán Kruszewski, Angeliki Lazaridou, Klemen Simonic, Tomas Mikolov

With machine learning successfully applied to new daunting problems almost every day, general AI starts looking like an attainable goal.

Continual Learning General Classification +1

Improving Neural Language Models with a Continuous Cache

13 code implementations 13 Dec 2016 Edouard Grave, Armand Joulin, Nicolas Usunier

We propose an extension to neural network language models to adapt their prediction to the recent history.

Language Modelling

FastText.zip: Compressing text classification models

41 code implementations 12 Dec 2016 Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, Tomas Mikolov

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.

Classification General Classification +3

Variable Computation in Recurrent Neural Networks

no code implementations 18 Nov 2016 Yacine Jernite, Edouard Grave, Armand Joulin, Tomas Mikolov

Recurrent neural networks (RNNs) have been used extensively and with increasing success to model various types of sequential data.

Efficient softmax approximation for GPUs

12 code implementations ICML 2017 Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou

We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies.

Enriching Word Vectors with Subword Information

49 code implementations TACL 2017 Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov

A vector representation is associated to each character $n$-gram; words being represented as the sum of these representations.

Word Embeddings Word Similarity

Revisiting Visual Question Answering Baselines

3 code implementations 27 Jun 2016 Allan Jabri, Armand Joulin, Laurens van der Maaten

Visual question answering (VQA) is an interesting learning setting for evaluating the abilities and shortcomings of current systems for image understanding.

Multiple-choice Visual Grounding +2

Locally-Optimized Inter-Subject Alignment of Functional Cortical Regions

no code implementations 7 Jun 2016 Marius Cătălin Iordan, Armand Joulin, Diane M. Beck, Li Fei-Fei

Our method outperforms the two most commonly used alternatives (anatomical landmark-based AFNI alignment and cortical convexity-based FreeSurfer alignment) in overlap between predicted region and functionally-defined LOC.

A Roadmap towards Machine Intelligence

1 code implementation 25 Nov 2015 Tomas Mikolov, Armand Joulin, Marco Baroni

The development of intelligent machines is one of the biggest unsolved challenges in computer science.

Learning Simple Algorithms from Examples

1 code implementation 23 Nov 2015 Wojciech Zaremba, Tomas Mikolov, Armand Joulin, Rob Fergus

We present an approach for learning simple algorithms such as copying, multi-digit addition and single digit multiplication directly from examples.


Alternative structures for character-level RNNs

1 code implementation 19 Nov 2015 Piotr Bojanowski, Armand Joulin, Tomas Mikolov

The first one consists of conditioning the character-level representation on the previous word representation.

Language Modelling

Learning Visual Features from Large Weakly Supervised Data

no code implementations 6 Nov 2015 Armand Joulin, Laurens van der Maaten, Allan Jabri, Nicolas Vasilache

We train convolutional networks on a dataset of 100 million Flickr photos and captions, and show that these networks produce features that perform well in a range of vision problems.

Representation Learning Word Similarity

Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets

3 code implementations NeurIPS 2015 Armand Joulin, Tomas Mikolov

Despite the recent achievements in machine learning, we are still very far from achieving real artificial intelligence.

Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

19 code implementations 19 Feb 2015 Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, Tomas Mikolov

One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent.

Question Answering Reading Comprehension

Learning Longer Memory in Recurrent Neural Networks

5 code implementations 24 Dec 2014 Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, Marc'Aurelio Ranzato

In this paper, we show that learning longer term patterns in real data, such as in natural language, is perfectly possible using gradient descent.

Language Modelling

Deep Fragment Embeddings for Bidirectional Image Sentence Mapping

no code implementations NeurIPS 2014 Andrej Karpathy, Armand Joulin, Li Fei-Fei

We introduce a model for bidirectional retrieval of images and sentences through a multi-modal embedding of visual and natural language data.

Referring Expression Comprehension

Unsupervised Joint Object Discovery and Segmentation in Internet Images

no code implementations CVPR 2013 Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu

In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search.

Object Discovery

Recovering Stereo Pairs from Anaglyphs

no code implementations CVPR 2013 Armand Joulin, Sing Bing Kang

An anaglyph is a single image created by selecting complementary colors from a stereo color pair; the user can perceive depth by viewing it through color-filtered glasses.

Efficient Optimization for Discriminative Latent Class Models

no code implementations NeurIPS 2010 Armand Joulin, Jean Ponce, Francis R. Bach

To avoid this problem, we introduce a local approximation of this cost function, which leads to a quadratic non-convex optimization problem over a product of simplices.

Document Classification General Classification +1
