Search Results for author: Gabriel Synnaeve

Found 63 papers, 40 papers with code

Star Temporal Classification: Sequence Classification with Partially Labeled Data

1 code implementation 28 Jan 2022 Vineel Pratap, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

These experiments show that STC can recover most of the performance of supervised baseline when up to 70% of the labels are missing.

Automatic Speech Recognition Classification +1

Pseudo-Labeling for Massively Multilingual Speech Recognition

no code implementations 30 Oct 2021 Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems.

Speech Recognition

Hierarchical Skills for Efficient Exploration

1 code implementation NeurIPS 2021 Jonas Gehring, Gabriel Synnaeve, Andreas Krause, Nicolas Usunier

We alleviate the need for prior knowledge by proposing a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.

Continuous Control Efficient Exploration +2

ASR4REAL: An extended benchmark for speech models

no code implementations 16 Oct 2021 Morgane Rivière, Jade Copet, Gabriel Synnaeve

Popular ASR benchmarks such as Librispeech and Switchboard are limited in the diversity of settings and speakers they represent.

Language Modelling

Word Order Does Not Matter For Speech Recognition

no code implementations 12 Oct 2021 Vineel Pratap, Qiantong Xu, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

In this paper, we study training of automatic speech recognition system in a weakly supervised setting where the order of words in transcript labels of the audio training data is not known.

Automatic Speech Recognition

XCiT: Cross-Covariance Image Transformers

10 code implementations NeurIPS 2021 Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.

Instance Segmentation Natural Language Processing +4

Differentiable Model Compression via Pseudo Quantization Noise

1 code implementation 20 Apr 2021 Alexandre Défossez, Yossi Adi, Gabriel Synnaeve

Given a single hyper-parameter expressing the desired balance between the quantized model size and accuracy, DiffQ can optimize the number of bits used per individual weight or groups of weights, in a single training.

Audio Source Separation Image Classification +3

Gradient Matching for Domain Generalization

2 code implementations ICLR 2022 Yuge Shi, Jeffrey Seely, Philip H. S. Torr, N. Siddharth, Awni Hannun, Nicolas Usunier, Gabriel Synnaeve

We perform experiments on both the Wilds benchmark, which captures distribution shift in the real world, as well as datasets in DomainBed benchmark that focuses more on synthetic-to-real transfer.

Domain Generalization

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

2 code implementations 2 Apr 2021 Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

On a large-scale competitive setup, we show that pre-training on unlabeled in-domain data reduces the gap between models trained on in-domain and out-of-domain labeled data by 66%-73%.

Self-Supervised Learning

Going deeper with Image Transformers

14 code implementations ICCV 2021 Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou

In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.

Ranked #1 on Image Classification on CIFAR-10 (using extra training data)

Image Classification Transfer Learning

ROMUL: Scale Adaptative Population Based Training

no code implementations 1 Jan 2021 Daniel Haziza, Jérémy Rapin, Gabriel Synnaeve

In most pragmatic settings, data augmentation and regularization are essential, and require hyperparameter search.

Data Augmentation Image Classification +1

MLS: A Large-Scale Multilingual Dataset for Speech Research

1 code implementation 7 Dec 2020 Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

This paper introduces Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus suitable for speech research.

Automatic Speech Recognition

Joint Masked CPC and CTC Training for ASR

1 code implementation 30 Oct 2020 Chaitanya Talnikar, Tatiana Likhomanenko, Ronan Collobert, Gabriel Synnaeve

Self-supervised learning (SSL) has shown promise in learning representations of audio that are useful for automatic speech recognition (ASR).

Automatic Speech Recognition Self-Supervised Learning

SlimIPL: Language-Model-Free Iterative Pseudo-Labeling

no code implementations 22 Oct 2020 Tatiana Likhomanenko, Qiantong Xu, Jacob Kahn, Gabriel Synnaeve, Ronan Collobert

We improve upon the IPL algorithm: as the model learns, we propose to iteratively re-generate transcriptions with hard labels (the most probable tokens), that is, without a language model.

Automatic Speech Recognition

Rethinking Evaluation in ASR: Are Our Models Robust Enough?

1 code implementation 22 Oct 2020 Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Paden Tomasello, Jacob Kahn, Gilad Avidov, Ronan Collobert, Gabriel Synnaeve

Finally, we show that training a single acoustic model on the most widely-used datasets - combined - reaches competitive performance on both research and real-world benchmarks.

Automatic Speech Recognition

Population Based Training for Data Augmentation and Regularization in Speech Recognition

no code implementations 8 Oct 2020 Daniel Haziza, Jérémy Rapin, Gabriel Synnaeve

It compares favorably to a baseline that does not change those hyperparameters over the course of training, with an 8% relative WER improvement.

Data Augmentation Speech Recognition

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

no code implementations 6 Jul 2020 Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages, and over-all simplifying deployment of ASR systems that support diverse languages.

Automatic Speech Recognition

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

1 code implementation 2 Jul 2020 Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

Contrastive Predictive Coding (CPC), based on predicting future segments of speech based on past segments is emerging as a powerful algorithm for representation learning of speech signal.

Contrastive Learning Data Augmentation +1

Real Time Speech Enhancement in the Waveform Domain

2 code implementations 23 Jun 2020 Alexandre Défossez, Gabriel Synnaeve, Yossi Adi

The proposed model matches state-of-the-art performance of both causal and non causal methods while working directly on the raw waveform.

Data Augmentation Speech Enhancement

Semi-Supervised Speech Recognition via Local Prior Matching

1 code implementation 24 Feb 2020 Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

For sequence transduction tasks like speech recognition, a strong structured prior model encodes rich information about the target space, implicitly ruling out invalid sequences by assigning them low probability.

Knowledge Distillation Speech Recognition

Scaling Up Online Speech Recognition Using ConvNets

no code implementations 27 Jan 2020 Vineel Pratap, Qiantong Xu, Jacob Kahn, Gilad Avidov, Tatiana Likhomanenko, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC).

Speech Recognition

Libri-Light: A Benchmark for ASR with Limited or No Supervision

1 code implementation 17 Dec 2019 Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux

Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).

Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-across metric)

Speech Recognition

End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

1 code implementation 19 Nov 2019 Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Tatiana Likhomanenko, Edouard Grave, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions.

Ranked #14 on Speech Recognition on LibriSpeech test-other (using extra training data)

Speech Recognition

Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

no code implementations 23 Oct 2019 Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig

As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses we introduce intermediate model heads and loss function.

A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning

1 code implementation NeurIPS 2019 Nicolas Carion, Gabriel Synnaeve, Alessandro Lazaric, Nicolas Usunier

While centralized reinforcement learning methods can optimally solve small MAC instances, they do not scale to large problems and they fail to generalize to scenarios different from those seen during training.

Multi-agent Reinforcement Learning reinforcement-learning +2

Self-Supervised Speech Recognition via Local Prior Matching

no code implementations 25 Sep 2019 Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

We propose local prior matching (LPM), a self-supervised objective for speech recognition.

Speech Recognition

Why Build an Assistant in Minecraft?

1 code implementation 22 Jul 2019 Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston

In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.

Natural Language Understanding

Growing Action Spaces

1 code implementation ICML 2020 Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve

In complex tasks, such as those with large combinatorial action spaces, random exploration may be too inefficient to achieve meaningful learning progress.

reinforcement-learning Starcraft

Who Needs Words? Lexicon-Free Speech Recognition

no code implementations 9 Apr 2019 Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Lexicon-free speech recognition naturally deals with the problem of out-of-vocabulary (OOV) words.

Speech Recognition

A Fully Differentiable Beam Search Decoder

1 code implementation 16 Feb 2019 Ronan Collobert, Awni Hannun, Gabriel Synnaeve

We demonstrate our approach scales by applying it to speech recognition, jointly training acoustic and word-level language models.

Speech Recognition

Fully Convolutional Speech Recognition

no code implementations 17 Dec 2018 Neil Zeghidour, Qiantong Xu, Vitaliy Liptchinsky, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert

In this paper we present an alternative approach based solely on convolutional neural networks, leveraging recent advances in acoustic models from the raw waveform and language modeling.

Speech Recognition

To Reverse the Gradient or Not: An Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition

no code implementations 9 Dec 2018 Yossi Adi, Neil Zeghidour, Ronan Collobert, Nicolas Usunier, Vitaliy Liptchinsky, Gabriel Synnaeve

In multi-task learning, the goal is speaker prediction; we expect a performance improvement with this joint training if the two tasks of speech recognition and speaker recognition share a common set of underlying features.

Multi-Task Learning Speaker Recognition +1

Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger

1 code implementation ICLR 2018 Gabriel Synnaeve, Zeming Lin, Jonas Gehring, Dan Gant, Vegard Mella, Vasil Khalidov, Nicolas Carion, Nicolas Usunier

We formulate the problem of defogging as state estimation and future state prediction from previous, partial observations in the context of real-time strategy games.


High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

no code implementations 21 Nov 2018 Jonas Gehring, Da Ju, Vegard Mella, Daniel Gant, Nicolas Usunier, Gabriel Synnaeve

We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy.

reinforcement-learning Starcraft

End-to-End Speech Recognition From the Raw Waveform

1 code implementation 19 Jun 2018 Neil Zeghidour, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert, Emmanuel Dupoux

In this paper, we study end-to-end systems trained directly from the raw waveform, building on two alternatives for trainable replacements of mel-filterbanks that use a convolutional architecture.

Speech Recognition

Value Propagation Networks

no code implementations ICLR 2018 Nantas Nardelli, Gabriel Synnaeve, Zeming Lin, Pushmeet Kohli, Philip H. S. Torr, Nicolas Usunier

We present Value Propagation (VProp), a set of parameter-efficient differentiable planning modules built on Value Iteration which can successfully be trained using reinforcement learning to solve unseen tasks, has the capability to generalize to larger map sizes, and can learn to navigate in dynamic environments.

reinforcement-learning Starcraft

Gated ConvNets for Letter-Based ASR

no code implementations ICLR 2018 Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model.

Speech Recognition

Letter-Based Speech Recognition with Gated ConvNets

2 code implementations 22 Dec 2017 Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In the recent literature, "end-to-end" speech systems often refer to letter-based acoustic models trained in a sequence-to-sequence manner, either via a recurrent model or via a structured output learning approach (such as CTC).

Speech Recognition

Learning Filterbanks from Raw Speech for Phone Recognition

2 code implementations 3 Nov 2017 Neil Zeghidour, Nicolas Usunier, Iasonas Kokkinos, Thomas Schatz, Gabriel Synnaeve, Emmanuel Dupoux

We train a bank of complex filters that operates on the raw waveform and is fed into a convolutional neural network for end-to-end phone recognition.

STARDATA: A StarCraft AI Research Dataset

1 code implementation 7 Aug 2017 Zeming Lin, Jonas Gehring, Vasil Khalidov, Gabriel Synnaeve

We provide full game state data along with the original replays that can be viewed in StarCraft.

Imitation Learning Starcraft

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

3 code implementations ICLR 2018 Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus

When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.

TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games

2 code implementations 1 Nov 2016 Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala, Timothée Lacroix, Zeming Lin, Florian Richoux, Nicolas Usunier

We present TorchCraft, a library that enables deep learning research on Real-Time Strategy (RTS) games such as StarCraft: Brood War, by making it easier to control these games from a machine learning framework, here Torch.


Wav2Letter: an End-to-End ConvNet-based Speech Recognition System

8 code implementations arXiv 2016 Ronan Collobert, Christian Puhrsch, Gabriel Synnaeve

This paper presents a simple end-to-end model for speech recognition, combining a convolutional network based acoustic model and a graph decoding.

Speech Recognition

MazeBase: A Sandbox for Learning from Games

2 code implementations 23 Nov 2015 Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus

This paper introduces MazeBase: an environment for simple 2D games, designed as a sandbox for machine learning approaches to reasoning and planning.

reinforcement-learning Starcraft

Weakly Supervised Multi-Embeddings Learning of Acoustic Models

no code implementations 20 Dec 2014 Gabriel Synnaeve, Emmanuel Dupoux

We trained a Siamese network with multi-task same/different information on a speech dataset, and found that it was possible to share a network for both tasks without a loss in performance.

A Dataset for StarCraft AI & an Example of Armies Clustering

1 code implementation 19 Nov 2012 Gabriel Synnaeve, Pierre Bessiere

We evaluated this clustering method by predicting the outcomes of battles based on armies compositions' mixtures components.

