Search Results for author: Gabriel Synnaeve

Found 82 papers, 51 papers with code

Masked Audio Generation using a Single Non-Autoregressive Transformer

no code implementations9 Jan 2024 Alon Ziv, Itai Gat, Gael Le Lan, Tal Remez, Felix Kreuk, Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi

We introduce MAGNeT, a masked generative sequence modeling method that operates directly over several streams of audio tokens.

Audio Generation

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

no code implementations5 Jan 2024 Alex Gu, Baptiste Rozière, Hugh Leather, Armando Solar-Lezama, Gabriel Synnaeve, Sida I. Wang

The best setup, GPT-4 with chain of thought (CoT), achieves a pass@1 of 75% and 81% on input and output prediction, respectively.

A Data Source for Reasoning Embodied Agents

1 code implementation14 Sep 2023 Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam

In this work, to further pursue these advances, we introduce a new data generator for machine reasoning that integrates with an embodied agent.

Code Llama: Open Foundation Models for Code

2 code implementations24 Aug 2023 Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Romain Sauvestre, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.

16k Code Generation +1

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

no code implementations10 Aug 2023 Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarani, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux

Recent work has shown that it is possible to resynthesize high-quality speech based, not on text, but on low bitrate discrete units that have been learned in a self-supervised fashion and can therefore capture expressive aspects of speech that are hard to transcribe (prosody, voice styles, non-verbal vocalization).

Resynthesis Speech Synthesis

Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning

no code implementations21 Dec 2022 Chris Lengerich, Gabriel Synnaeve, Amy Zhang, Hugh Leather, Kurt Shuster, François Charton, Charysse Redwood

Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization.

Few-Shot Learning Language Modelling +2

Leveraging Demonstrations with Latent Space Priors

1 code implementation26 Oct 2022 Jonas Gehring, Deepak Gopinath, Jungdam Won, Andreas Krause, Gabriel Synnaeve, Nicolas Usunier

Starting with a learned joint latent space, we separately train a generative model of demonstration sequences and an accompanying low-level policy.

Offline RL

High Fidelity Neural Audio Compression

2 code implementations24 Oct 2022 Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi

We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural networks.

Audio Compression Vocal Bursts Intensity Prediction

AudioGen: Textually Guided Audio Generation

1 code implementation30 Sep 2022 Felix Kreuk, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, Yossi Adi

Finally, we explore the ability of the proposed method to generate audio continuation conditionally and unconditionally.

Audio Generation Descriptive

Code Translation with Compiler Representations

1 code implementation30 Jun 2022 Marc Szafraniec, Baptiste Roziere, Hugh Leather, Francois Charton, Patrick Labatut, Gabriel Synnaeve

Here we propose to augment code translation with IRs, specifically LLVM IR, with results on the C++, Java, Rust, and Go languages.

Code Translation Machine Translation +2

Pseudo-Labeling for Massively Multilingual Speech Recognition

no code implementations30 Oct 2021 Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems.

speech-recognition Speech Recognition

Hierarchical Skills for Efficient Exploration

1 code implementation NeurIPS 2021 Jonas Gehring, Gabriel Synnaeve, Andreas Krause, Nicolas Usunier

We alleviate the need for prior knowledge by proposing a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.

Continuous Control Efficient Exploration +4

ASR4REAL: An extended benchmark for speech models

no code implementations16 Oct 2021 Morgane Riviere, Jade Copet, Gabriel Synnaeve

Popular ASR benchmarks such as Librispeech and Switchboard are limited in the diversity of settings and speakers they represent.

Language Modelling

Word Order Does Not Matter For Speech Recognition

no code implementations12 Oct 2021 Vineel Pratap, Qiantong Xu, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

In this paper, we study training of automatic speech recognition system in a weakly supervised setting where the order of words in transcript labels of the audio training data is not known.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

XCiT: Cross-Covariance Image Transformers

11 code implementations NeurIPS 2021 Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jegou

We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.

Instance Segmentation object-detection +3

Gradient Matching for Domain Generalization

2 code implementations ICLR 2022 Yuge Shi, Jeffrey Seely, Philip H. S. Torr, N. Siddharth, Awni Hannun, Nicolas Usunier, Gabriel Synnaeve

We perform experiments on both the Wilds benchmark, which captures distribution shift in the real world, as well as datasets in DomainBed benchmark that focuses more on synthetic-to-real transfer.

Domain Generalization

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

3 code implementations2 Apr 2021 Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

On a large-scale competitive setup, we show that pre-training on unlabeled in-domain data reduces the gap between models trained on in-domain and out-of-domain labeled data by 66%-73%.

Self-Supervised Learning

Going deeper with Image Transformers

19 code implementations ICCV 2021 Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou

In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.

Ranked #5 on Image Classification on CIFAR-10 (using extra training data)

Image Classification Transfer Learning

ROMUL: Scale Adaptative Population Based Training

no code implementations1 Jan 2021 Daniel Haziza, Jérémy Rapin, Gabriel Synnaeve

In most pragmatic settings, data augmentation and regularization are essential, and require hyperparameter search.

Data Augmentation Image Classification +1

Joint Masked CPC and CTC Training for ASR

1 code implementation30 Oct 2020 Chaitanya Talnikar, Tatiana Likhomanenko, Ronan Collobert, Gabriel Synnaeve

Self-supervised learning (SSL) has shown promise in learning representations of audio that are useful for automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

SlimIPL: Language-Model-Free Iterative Pseudo-Labeling

no code implementations22 Oct 2020 Tatiana Likhomanenko, Qiantong Xu, Jacob Kahn, Gabriel Synnaeve, Ronan Collobert

We improve upon the IPL algorithm: as the model learns, we propose to iteratively re-generate transcriptions with hard labels (the most probable tokens), that is, without a language model.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Self-training and Pre-training are Complementary for Speech Recognition

3 code implementations22 Oct 2020 Qiantong Xu, Alexei Baevski, Tatiana Likhomanenko, Paden Tomasello, Alexis Conneau, Ronan Collobert, Gabriel Synnaeve, Michael Auli

Self-training and unsupervised pre-training have emerged as effective approaches to improve speech recognition systems using unlabeled data.

 Ranked #1 on Speech Recognition on LibriSpeech train-clean-100 test-other (using extra training data)

speech-recognition Speech Recognition +1

Rethinking Evaluation in ASR: Are Our Models Robust Enough?

1 code implementation22 Oct 2020 Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Paden Tomasello, Jacob Kahn, Gilad Avidov, Ronan Collobert, Gabriel Synnaeve

Finally, we show that training a single acoustic model on the most widely-used datasets - combined - reaches competitive performance on both research and real-world benchmarks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Population Based Training for Data Augmentation and Regularization in Speech Recognition

no code implementations8 Oct 2020 Daniel Haziza, Jérémy Rapin, Gabriel Synnaeve

It compares favorably to a baseline that does not change those hyperparameters over the course of training, with an 8% relative WER improvement.

Data Augmentation speech-recognition +1

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

no code implementations6 Jul 2020 Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages, and over-all simplifying deployment of ASR systems that support diverse languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

1 code implementation2 Jul 2020 Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

Contrastive Predictive Coding (CPC), based on predicting future segments of speech based on past segments is emerging as a powerful algorithm for representation learning of speech signal.

Contrastive Learning Data Augmentation +1

Real Time Speech Enhancement in the Waveform Domain

3 code implementations23 Jun 2020 Alexandre Defossez, Gabriel Synnaeve, Yossi Adi

The proposed model matches state-of-the-art performance of both causal and non causal methods while working directly on the raw waveform.

Data Augmentation Speech Enhancement

Semi-Supervised Speech Recognition via Local Prior Matching

1 code implementation24 Feb 2020 Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

For sequence transduction tasks like speech recognition, a strong structured prior model encodes rich information about the target space, implicitly ruling out invalid sequences by assigning them low probability.

Knowledge Distillation Language Modelling +2

Scaling Up Online Speech Recognition Using ConvNets

no code implementations27 Jan 2020 Vineel Pratap, Qiantong Xu, Jacob Kahn, Gilad Avidov, Tatiana Likhomanenko, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC).

speech-recognition Speech Recognition

Libri-Light: A Benchmark for ASR with Limited or No Supervision

2 code implementations17 Dec 2019 Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux

Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).

 Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-within metric)

speech-recognition Speech Recognition

End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

1 code implementation19 Nov 2019 Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Tatiana Likhomanenko, Edouard Grave, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions.

Ranked #16 on Speech Recognition on LibriSpeech test-other (using extra training data)

Language Modelling speech-recognition +1

Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

1 code implementation23 Oct 2019 Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig

As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses we introduce intermediate model heads and loss function.

A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning

1 code implementation NeurIPS 2019 Nicolas Carion, Gabriel Synnaeve, Alessandro Lazaric, Nicolas Usunier

While centralized reinforcement learning methods can optimally solve small MAC instances, they do not scale to large problems and they fail to generalize to scenarios different from those seen during training.

Multi-agent Reinforcement Learning reinforcement-learning +4

Why Build an Assistant in Minecraft?

1 code implementation22 Jul 2019 Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston

In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.

Natural Language Understanding

Growing Action Spaces

1 code implementation ICML 2020 Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve

In complex tasks, such as those with large combinatorial action spaces, random exploration may be too inefficient to achieve meaningful learning progress.

reinforcement-learning Reinforcement Learning (RL) +1

Who Needs Words? Lexicon-Free Speech Recognition

no code implementations9 Apr 2019 Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Lexicon-free speech recognition naturally deals with the problem of out-of-vocabulary (OOV) words.

speech-recognition Speech Recognition

A Fully Differentiable Beam Search Decoder

1 code implementation16 Feb 2019 Ronan Collobert, Awni Hannun, Gabriel Synnaeve

We demonstrate our approach scales by applying it to speech recognition, jointly training acoustic and word-level language models.

Language Modelling speech-recognition +1

Fully Convolutional Speech Recognition

no code implementations17 Dec 2018 Neil Zeghidour, Qiantong Xu, Vitaliy Liptchinsky, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert

In this paper we present an alternative approach based solely on convolutional neural networks, leveraging recent advances in acoustic models from the raw waveform and language modeling.

Language Modelling speech-recognition +1

To Reverse the Gradient or Not: An Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition

no code implementations9 Dec 2018 Yossi Adi, Neil Zeghidour, Ronan Collobert, Nicolas Usunier, Vitaliy Liptchinsky, Gabriel Synnaeve

In multi-task learning, the goal is speaker prediction; we expect a performance improvement with this joint training if the two tasks of speech recognition and speaker recognition share a common set of underlying features.

Multi-Task Learning Speaker Recognition +2

Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger

1 code implementation ICLR 2018 Gabriel Synnaeve, Zeming Lin, Jonas Gehring, Dan Gant, Vegard Mella, Vasil Khalidov, Nicolas Carion, Nicolas Usunier

We formulate the problem of defogging as state estimation and future state prediction from previous, partial observations in the context of real-time strategy games.

Starcraft

High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

no code implementations21 Nov 2018 Jonas Gehring, Da Ju, Vegard Mella, Daniel Gant, Nicolas Usunier, Gabriel Synnaeve

We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy.

reinforcement-learning Reinforcement Learning (RL) +2

End-to-End Speech Recognition From the Raw Waveform

1 code implementation19 Jun 2018 Neil Zeghidour, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert, Emmanuel Dupoux

In this paper, we study end-to-end systems trained directly from the raw waveform, building on two alternatives for trainable replacements of mel-filterbanks that use a convolutional architecture.

speech-recognition Speech Recognition

Value Propagation Networks

no code implementations ICLR 2018 Nantas Nardelli, Gabriel Synnaeve, Zeming Lin, Pushmeet Kohli, Philip H. S. Torr, Nicolas Usunier

We present Value Propagation (VProp), a set of parameter-efficient differentiable planning modules built on Value Iteration which can successfully be trained using reinforcement learning to solve unseen tasks, has the capability to generalize to larger map sizes, and can learn to navigate in dynamic environments.

Navigate reinforcement-learning +2

Gated ConvNets for Letter-Based ASR

no code implementations ICLR 2018 Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model.

Language Modelling speech-recognition +1

Letter-Based Speech Recognition with Gated ConvNets

2 code implementations22 Dec 2017 Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In the recent literature, "end-to-end" speech systems often refer to letter-based acoustic models trained in a sequence-to-sequence manner, either via a recurrent model or via a structured output learning approach (such as CTC).

Language Modelling speech-recognition +1

Learning Filterbanks from Raw Speech for Phone Recognition

2 code implementations3 Nov 2017 Neil Zeghidour, Nicolas Usunier, Iasonas Kokkinos, Thomas Schatz, Gabriel Synnaeve, Emmanuel Dupoux

We train a bank of complex filters that operates on the raw waveform and is fed into a convolutional neural network for end-to-end phone recognition.

STARDATA: A StarCraft AI Research Dataset

1 code implementation7 Aug 2017 Zeming Lin, Jonas Gehring, Vasil Khalidov, Gabriel Synnaeve

We provide full game state data along with the original replays that can be viewed in StarCraft.

Imitation Learning Starcraft +1

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

3 code implementations ICLR 2018 Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus

When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.

TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games

2 code implementations1 Nov 2016 Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala, Timothée Lacroix, Zeming Lin, Florian Richoux, Nicolas Usunier

We present TorchCraft, a library that enables deep learning research on Real-Time Strategy (RTS) games such as StarCraft: Brood War, by making it easier to control these games from a machine learning framework, here Torch.

BIG-bench Machine Learning Starcraft

Wav2Letter: an End-to-End ConvNet-based Speech Recognition System

9 code implementations arXiv 2016 Ronan Collobert, Christian Puhrsch, Gabriel Synnaeve

This paper presents a simple end-to-end model for speech recognition, combining a convolutional network based acoustic model and a graph decoding.

Speech Recognition

MazeBase: A Sandbox for Learning from Games

2 code implementations23 Nov 2015 Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus

This paper introduces MazeBase: an environment for simple 2D games, designed as a sandbox for machine learning approaches to reasoning and planning.

Negation Reinforcement Learning (RL) +1

Weakly Supervised Multi-Embeddings Learning of Acoustic Models

no code implementations20 Dec 2014 Gabriel Synnaeve, Emmanuel Dupoux

We trained a Siamese network with multi-task same/different information on a speech dataset, and found that it was possible to share a network for both tasks without a loss in performance.

A Dataset for StarCraft AI \& an Example of Armies Clustering

1 code implementation19 Nov 2012 Gabriel Synnaeve, Pierre Bessiere

We evaluated this clustering method by predicting the outcomes of battles based on armies compositions' mixtures components

Clustering Starcraft

Cannot find the paper you are looking for? You can Submit a new open access paper.