Search Results for author: Gabriel Synnaeve

Found 82 papers, 51 papers with code

SpiRit-LM: Interleaved Spoken and Written Language Model

no code implementations • 8 Feb 2024 • Tu Anh Nguyen, Benjamin Muller, Bokai Yu, Marta R. Costa-Jussa, Maha Elbayad, Sravya Popuri, Paul-Ambroise Duquenne, Robin Algayres, Ruslan Mavlyutov, Itai Gat, Gabriel Synnaeve, Juan Pino, Benoit Sagot, Emmanuel Dupoux

We introduce SPIRIT-LM, a foundation multimodal language model that freely mixes text and speech.

Language Modelling

Paper
Add Code

Getting the most out of your tokenizer for pre-training and domain adaptation

1 code implementation • 1 Feb 2024 • Gautier Dagan, Gabriel Synnaeve, Baptiste Rozière

Tokenization is an understudied and often neglected component of modern LLMs.

Code Generation Domain Adaptation

Paper
Code

Masked Audio Generation using a Single Non-Autoregressive Transformer

no code implementations • 9 Jan 2024 • Alon Ziv, Itai Gat, Gael Le Lan, Tal Remez, Felix Kreuk, Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi

We introduce MAGNeT, a masked generative sequence modeling method that operates directly over several streams of audio tokens.

Audio Generation

Paper
Add Code

CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

no code implementations • 5 Jan 2024 • Alex Gu, Baptiste Rozière, Hugh Leather, Armando Solar-Lezama, Gabriel Synnaeve, Sida I. Wang

The best setup, GPT-4 with chain of thought (CoT), achieves a pass@1 of 75% and 81% on input and output prediction, respectively.

Paper
Add Code

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

no code implementations • 7 Dec 2023 • Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe

This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants.

Language Modelling Large Language Model

Paper
Add Code

Generative Spoken Language Model based on continuous word-sized audio tokens

no code implementations • 8 Oct 2023 • Robin Algayres, Yossi Adi, Tu Anh Nguyen, Jade Copet, Gabriel Synnaeve, Benoit Sagot, Emmanuel Dupoux

In NLP, text language models based on words or subwords are known to outperform their character-based counterparts.

Language Modelling

Paper
Add Code

A Data Source for Reasoning Embodied Agents

1 code implementation • 14 Sep 2023 • Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam

In this work, to further pursue these advances, we introduce a new data generator for machine reasoning that integrates with an embodied agent.

Paper
Code

Large Language Models for Compiler Optimization

no code implementations • 11 Sep 2023 • Chris Cummins, Volker Seeker, Dejan Grubisic, Mostafa Elhoushi, Youwei Liang, Baptiste Roziere, Jonas Gehring, Fabian Gloeckle, Kim Hazelwood, Gabriel Synnaeve, Hugh Leather

We explore the novel application of Large Language Models to code optimization.

Auxiliary Learning Compiler Optimization

Paper
Add Code

Code Llama: Open Foundation Models for Code

2 code implementations • 24 Aug 2023 • Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Romain Sauvestre, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.

Ranked #12 on Code Generation on HumanEval

16k Code Generation +1

14,330

Paper
Code

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis

no code implementations • 10 Aug 2023 • Tu Anh Nguyen, Wei-Ning Hsu, Antony D'Avirro, Bowen Shi, Itai Gat, Maryam Fazel-Zarani, Tal Remez, Jade Copet, Gabriel Synnaeve, Michael Hassid, Felix Kreuk, Yossi Adi, Emmanuel Dupoux

Recent work has shown that it is possible to resynthesize high-quality speech based, not on text, but on low bitrate discrete units that have been learned in a self-supervised fashion and can therefore capture expressive aspects of speech that are hard to transcribe (prosody, voice styles, non-verbal vocalization).

Resynthesis Speech Synthesis

Paper
Add Code

Simple and Controllable Music Generation

2 code implementations • NeurIPS 2023 • Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi, Alexandre Défossez

We tackle the task of conditional music generation.

Ranked #4 on Text-to-Music Generation on MusicCaps

Language Modelling Music Generation +1

19,547

Paper
Code

Textually Pretrained Speech Language Models

1 code implementation • NeurIPS 2023 • Michael Hassid, Tal Remez, Tu Anh Nguyen, Itai Gat, Alexis Conneau, Felix Kreuk, Jade Copet, Alexandre Defossez, Gabriel Synnaeve, Emmanuel Dupoux, Roy Schwartz, Yossi Adi

In this work, we propose TWIST, a method for training SpeechLMs using a warm-start from a pretrained textual language model.

Paper
Code

DINOv2: Learning Robust Visual Features without Supervision

11 code implementations • 14 Apr 2023 • Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski

The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision.

Ranked #1 on Image Classification on CIFAR-10

Domain Generalization Fine-Grained Image Classification +5

124,527

Paper
Code

Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning

no code implementations • 21 Dec 2022 • Chris Lengerich, Gabriel Synnaeve, Amy Zhang, Hugh Leather, Kurt Shuster, François Charton, Charysse Redwood

Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization.

Few-Shot Learning Language Modelling +2

Paper
Add Code

Leveraging Demonstrations with Latent Space Priors

1 code implementation • 26 Oct 2022 • Jonas Gehring, Deepak Gopinath, Jungdam Won, Andreas Krause, Gabriel Synnaeve, Nicolas Usunier

Starting with a learned joint latent space, we separately train a generative model of demonstration sequences and an accompanying low-level policy.

Offline RL

Paper
Code

High Fidelity Neural Audio Compression

2 code implementations • 24 Oct 2022 • Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi

We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural networks.

Audio Compression Vocal Bursts Intensity Prediction

3,154

Paper
Code

Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling

no code implementations • 30 Sep 2022 • Itai Gat, Felix Kreuk, Tu Anh Nguyen, Ann Lee, Jade Copet, Gabriel Synnaeve, Emmanuel Dupoux, Yossi Adi

This work focuses on improving the robustness of discrete input representations for generative spoken language modeling.

Language Modelling Speech-to-Speech Translation

Paper
Add Code

AudioGen: Textually Guided Audio Generation

1 code implementation • 30 Sep 2022 • Felix Kreuk, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, Yossi Adi

Finally, we explore the ability of the proposed method to generate audio continuation conditionally and unconditionally.

Ranked #12 on Audio Generation on AudioCaps

Audio Generation Descriptive

19,547

Paper
Code

Code Translation with Compiler Representations

1 code implementation • 30 Jun 2022 • Marc Szafraniec, Baptiste Roziere, Hugh Leather, Francois Charton, Patrick Labatut, Gabriel Synnaeve

Here we propose to augment code translation with IRs, specifically LLVM IR, with results on the C++, Java, Rust, and Go languages.

Code Translation Machine Translation +2

672

Paper
Code

Flashlight: Enabling Innovation in Tools for Machine Learning

2 code implementations • 29 Jan 2022 • Jacob Kahn, Vineel Pratap, Tatiana Likhomanenko, Qiantong Xu, Awni Hannun, Jeff Cai, Paden Tomasello, Ann Lee, Edouard Grave, Gilad Avidov, Benoit Steiner, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

This is in part due to the difficulties involved in prototyping new computational paradigms with existing frameworks.

BIG-bench Machine Learning

5,142

Paper
Code

Star Temporal Classification: Sequence Classification with Partially Labeled Data

1 code implementation • 28 Jan 2022 • Vineel Pratap, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

These experiments show that STC can recover most of the performance of the supervised baseline when up to 70% of the labels are missing.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Code

Augmenting Convolutional networks with attention-based aggregation

5 code implementations • 27 Dec 2021 • Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Piotr Bojanowski, Armand Joulin, Gabriel Synnaeve, Hervé Jégou

We show how to augment any convolutional network with an attention-based global map to achieve non-local reasoning.

Ranked #38 on Semantic Segmentation on ADE20K val

Classification Image Classification +3

3,856

Paper
Code

Pseudo-Labeling for Massively Multilingual Speech Recognition

no code implementations • 30 Oct 2021 • Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems.

speech-recognition Speech Recognition

Paper
Add Code

Hierarchical Skills for Efficient Exploration

1 code implementation • NeurIPS 2021 • Jonas Gehring, Gabriel Synnaeve, Andreas Krause, Nicolas Usunier

We alleviate the need for prior knowledge by proposing a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.

Continuous Control Efficient Exploration +4

Paper
Code

ASR4REAL: An extended benchmark for speech models

no code implementations • 16 Oct 2021 • Morgane Riviere, Jade Copet, Gabriel Synnaeve

Popular ASR benchmarks such as Librispeech and Switchboard are limited in the diversity of settings and speakers they represent.

Language Modelling

Paper
Add Code

Leveraging Automated Unit Tests for Unsupervised Code Translation

1 code implementation • ICLR 2022 • Baptiste Roziere, Jie M. Zhang, Francois Charton, Mark Harman, Gabriel Synnaeve, Guillaume Lample

With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation.

Code Translation Sentence +2

672

Paper
Code

Word Order Does Not Matter For Speech Recognition

no code implementations • 12 Oct 2021 • Vineel Pratap, Qiantong Xu, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

In this paper, we study training an automatic speech recognition system in a weakly supervised setting where the order of words in the transcript labels of the audio training data is not known.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

XCiT: Cross-Covariance Image Transformers

11 code implementations • NeurIPS 2021 • Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jegou

We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.

Ranked #55 on Instance Segmentation on COCO minival

Instance Segmentation object-detection +3

29,671

Paper
Code

CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings

1 code implementation • NeurIPS 2021 • Tatiana Likhomanenko, Qiantong Xu, Gabriel Synnaeve, Ronan Collobert, Alex Rogozhnikov

Absolute or relative positional embeddings are the most popular ways to feed Transformer models with positional information.

Machine Translation speech-recognition +2

Paper
Code

ResMLP: Feedforward networks for image classification with data-efficient training

15 code implementations • NeurIPS 2021 • Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification.

Ranked #1 on Image Classification on Certificate Verification

Data Augmentation Fine-Grained Image Classification +4

29,671

Paper
Code

MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding

3 code implementations • 26 Apr 2021 • Aishwarya Kamath, Mannat Singh, Yann Lecun, Gabriel Synnaeve, Ishan Misra, Nicolas Carion

We also investigate the utility of our model as an object detector on a given label set when fine-tuned in a few-shot setting.

Ranked #1 on Visual Question Answering (VQA) on CLEVR-Humans

Generalized Referring Expression Comprehension Phrase Grounding +9

1,286

Paper
Code

Gradient Matching for Domain Generalization

2 code implementations • ICLR 2022 • Yuge Shi, Jeffrey Seely, Philip H. S. Torr, N. Siddharth, Awni Hannun, Nicolas Usunier, Gabriel Synnaeve

We perform experiments on both the Wilds benchmark, which captures distribution shift in the real world, and datasets in the DomainBed benchmark, which focuses more on synthetic-to-real transfer.

Domain Generalization

1,328

Paper
Code

Differentiable Model Compression via Pseudo Quantization Noise

1 code implementation • 20 Apr 2021 • Alexandre Défossez, Yossi Adi, Gabriel Synnaeve

DiffQ is differentiable both with respect to the unquantized weights and the number of bits used.

Ranked #29 on Language Modelling on WikiText-103

Audio Source Separation Image Classification +3

229

Paper
Code

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

3 code implementations • 2 Apr 2021 • Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

On a large-scale competitive setup, we show that pre-training on unlabeled in-domain data reduces the gap between models trained on in-domain and out-of-domain labeled data by 66%-73%.

Self-Supervised Learning

29,192

Paper
Code

Going deeper with Image Transformers

19 code implementations • ICCV 2021 • Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou

In particular, we investigate the interplay of architecture and optimization of such dedicated transformers.

Ranked #5 on Image Classification on CIFAR-10 (using extra training data)

Image Classification Transfer Learning

29,671

Paper
Code

MDETR - Modulated Detection for End-to-End Multi-Modal Understanding

1 code implementation • ICCV 2021 • Aishwarya Kamath, Mannat Singh, Yann Lecun, Gabriel Synnaeve, Ishan Misra, Nicolas Carion

We also investigate the utility of our model as an object detector on a given label set when fine-tuned in a few-shot setting.

Ranked #2 on Referring Expression Comprehension on Talk2Car (using extra training data)

Phrase Grounding Question Answering +3

936

Paper
Code

ROMUL: Scale Adaptative Population Based Training

no code implementations • 1 Jan 2021 • Daniel Haziza, Jérémy Rapin, Gabriel Synnaeve

In most pragmatic settings, data augmentation and regularization are essential, and require hyperparameter search.

Data Augmentation Image Classification +1

Paper
Add Code

MLS: A Large-Scale Multilingual Dataset for Speech Research

1 code implementation • 7 Dec 2020 • Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

This paper introduces the Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus suitable for speech research.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

6,331

Paper
Code

Joint Masked CPC and CTC Training for ASR

1 code implementation • 30 Oct 2020 • Chaitanya Talnikar, Tatiana Likhomanenko, Ronan Collobert, Gabriel Synnaeve

Self-supervised learning (SSL) has shown promise in learning representations of audio that are useful for automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

491

Paper
Code

SlimIPL: Language-Model-Free Iterative Pseudo-Labeling

no code implementations • 22 Oct 2020 • Tatiana Likhomanenko, Qiantong Xu, Jacob Kahn, Gabriel Synnaeve, Ronan Collobert

We improve upon the IPL algorithm: as the model learns, we propose to iteratively re-generate transcriptions with hard labels (the most probable tokens), that is, without a language model.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Self-training and Pre-training are Complementary for Speech Recognition

3 code implementations • 22 Oct 2020 • Qiantong Xu, Alexei Baevski, Tatiana Likhomanenko, Paden Tomasello, Alexis Conneau, Ronan Collobert, Gabriel Synnaeve, Michael Auli

Self-training and unsupervised pre-training have emerged as effective approaches to improve speech recognition systems using unlabeled data.

Ranked #1 on Speech Recognition on LibriSpeech train-clean-100 test-other (using extra training data)

speech-recognition Speech Recognition +1

29,193

Paper
Code

Rethinking Evaluation in ASR: Are Our Models Robust Enough?

1 code implementation • 22 Oct 2020 • Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Paden Tomasello, Jacob Kahn, Gilad Avidov, Ronan Collobert, Gabriel Synnaeve

Finally, we show that training a single acoustic model on the most widely-used datasets - combined - reaches competitive performance on both research and real-world benchmarks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

6,331

Paper
Code

Population Based Training for Data Augmentation and Regularization in Speech Recognition

no code implementations • 8 Oct 2020 • Daniel Haziza, Jérémy Rapin, Gabriel Synnaeve

It compares favorably to a baseline that does not change those hyperparameters over the course of training, with an 8% relative WER improvement.

Data Augmentation speech-recognition +1

Paper
Add Code

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

no code implementations • 6 Jul 2020 • Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages, and overall simplifying deployment of ASR systems that support diverse languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

1 code implementation • 2 Jul 2020 • Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

Contrastive Predictive Coding (CPC), which predicts future segments of speech from past segments, is emerging as a powerful algorithm for representation learning of speech signals.

Contrastive Learning Data Augmentation +1

626

Paper
Code

Real Time Speech Enhancement in the Waveform Domain

3 code implementations • 23 Jun 2020 • Alexandre Defossez, Gabriel Synnaeve, Yossi Adi

The proposed model matches state-of-the-art performance of both causal and non-causal methods while working directly on the raw waveform.

Data Augmentation Speech Enhancement

1,557

Paper
Code

End-to-End Object Detection with Transformers

37 code implementations • ECCV 2020 • Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko

We present a new method that views object detection as a direct set prediction problem.

Ranked #21 on Panoptic Segmentation on COCO minival

Object Panoptic Segmentation +1

124,527

Paper
Code

Iterative Pseudo-Labeling for Speech Recognition

1 code implementation • 19 May 2020 • Qiantong Xu, Tatiana Likhomanenko, Jacob Kahn, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

In particular, IPL fine-tunes an existing model at each iteration using both labeled data and a subset of unlabeled data.

Ranked #11 on Speech Recognition on LibriSpeech test-other

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

6,331

Paper
Code

Contextualizing ASR Lattice Rescoring with Hybrid Pointer Network Language Model

no code implementations • 15 May 2020 • Da-Rong Liu, Chunxi Liu, Frank Zhang, Gabriel Synnaeve, Yatharth Saraf, Geoffrey Zweig

Videos uploaded on social media are often accompanied by textual descriptions.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Semi-Supervised Speech Recognition via Local Prior Matching

1 code implementation • 24 Feb 2020 • Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

For sequence transduction tasks like speech recognition, a strong structured prior model encodes rich information about the target space, implicitly ruling out invalid sequences by assigning them low probability.

Ranked #45 on Speech Recognition on LibriSpeech test-other

Knowledge Distillation Language Modelling +2

6,331

Paper
Code

Polygames: Improved Zero Learning

no code implementations • 27 Jan 2020 • Tristan Cazenave, Yen-Chi Chen, Guan-Wei Chen, Shi-Yu Chen, Xian-Dong Chiu, Julien Dehos, Maria Elsa, Qucheng Gong, Hengyuan Hu, Vasil Khalidov, Cheng-Ling Li, Hsin-I Lin, Yu-Jin Lin, Xavier Martinet, Vegard Mella, Jeremy Rapin, Baptiste Roziere, Gabriel Synnaeve, Fabien Teytaud, Olivier Teytaud, Shi-Cheng Ye, Yi-Jun Ye, Shi-Jim Yen, Sergey Zagoruyko

Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art method for many board games.

Board Games

Paper
Add Code

Scaling Up Online Speech Recognition Using ConvNets

no code implementations • 27 Jan 2020 • Vineel Pratap, Qiantong Xu, Jacob Kahn, Gilad Avidov, Tatiana Likhomanenko, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC).

speech-recognition Speech Recognition

Paper
Add Code

Libri-Light: A Benchmark for ASR with Limited or No Supervision

2 code implementations • 17 Dec 2019 • Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux

Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).

Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-within metric)

speech-recognition Speech Recognition

444

Paper
Code

End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

1 code implementation • 19 Nov 2019 • Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Tatiana Likhomanenko, Edouard Grave, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions.

Ranked #16 on Speech Recognition on LibriSpeech test-other (using extra training data)

Language Modelling speech-recognition +1

5,333

Paper
Code

Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

1 code implementation • 23 Oct 2019 • Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig

As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses, we introduce intermediate model heads and loss function.

Paper
Code

A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning

1 code implementation • NeurIPS 2019 • Nicolas Carion, Gabriel Synnaeve, Alessandro Lazaric, Nicolas Usunier

While centralized reinforcement learning methods can optimally solve small MAC instances, they do not scale to large problems and they fail to generalize to scenarios different from those seen during training.

Multi-agent Reinforcement Learning reinforcement-learning +4

644

Paper
Code

Self-Supervised Speech Recognition via Local Prior Matching

no code implementations • 25 Sep 2019 • Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

We propose local prior matching (LPM), a self-supervised objective for speech recognition.

Language Modelling speech-recognition +1

Paper
Add Code

Why Build an Assistant in Minecraft?

1 code implementation • 22 Jul 2019 • Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston

In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.

Natural Language Understanding

605

Paper
Code

Growing Action Spaces

1 code implementation • ICML 2020 • Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve

In complex tasks, such as those with large combinatorial action spaces, random exploration may be too inefficient to achieve meaningful learning progress.

reinforcement-learning Reinforcement Learning (RL) +1

644

Paper
Code

Word-level Speech Recognition with a Letter to Word Encoder

no code implementations • ICML 2020 • Ronan Collobert, Awni Hannun, Gabriel Synnaeve

We propose a direct-to-word sequence model which uses a word network to learn word embeddings from letters.

General Classification speech-recognition +2

Paper
Add Code

Who Needs Words? Lexicon-Free Speech Recognition

no code implementations • 9 Apr 2019 • Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Lexicon-free speech recognition naturally deals with the problem of out-of-vocabulary (OOV) words.

speech-recognition Speech Recognition

Paper
Add Code

A Fully Differentiable Beam Search Decoder

1 code implementation • 16 Feb 2019 • Ronan Collobert, Awni Hannun, Gabriel Synnaeve

We demonstrate that our approach scales by applying it to speech recognition, jointly training acoustic and word-level language models.

Language Modelling speech-recognition +1

137

Paper
Code

wav2letter++: The Fastest Open-source Speech Recognition System

8 code implementations • 18 Dec 2018 • Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert

This paper introduces wav2letter++, the fastest open-source deep learning speech recognition framework.

Speech Recognition

6,331

Paper
Code

Fully Convolutional Speech Recognition

no code implementations • 17 Dec 2018 • Neil Zeghidour, Qiantong Xu, Vitaliy Liptchinsky, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert

In this paper we present an alternative approach based solely on convolutional neural networks, leveraging recent advances in acoustic models from the raw waveform and language modeling.

Ranked #3 on Speech Recognition on WSJ eval93

Language Modelling speech-recognition +1

Paper
Add Code

To Reverse the Gradient or Not: An Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition

no code implementations • 9 Dec 2018 • Yossi Adi, Neil Zeghidour, Ronan Collobert, Nicolas Usunier, Vitaliy Liptchinsky, Gabriel Synnaeve

In multi-task learning, the goal is speaker prediction; we expect a performance improvement with this joint training if the two tasks of speech recognition and speaker recognition share a common set of underlying features.

Multi-Task Learning Speaker Recognition +2

Paper
Add Code

Forward Modeling for Partial Observation Strategy Games - A StarCraft Defogger

1 code implementation • ICLR 2018 • Gabriel Synnaeve, Zeming Lin, Jonas Gehring, Dan Gant, Vegard Mella, Vasil Khalidov, Nicolas Carion, Nicolas Usunier

We formulate the problem of defogging as state estimation and future state prediction from previous, partial observations in the context of real-time strategy games.

Starcraft

Paper
Code

High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

no code implementations • 21 Nov 2018 • Jonas Gehring, Da Ju, Vegard Mella, Daniel Gant, Nicolas Usunier, Gabriel Synnaeve

We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy.

reinforcement-learning Reinforcement Learning (RL) +2

Paper
Add Code

End-to-End Speech Recognition From the Raw Waveform

1 code implementation • 19 Jun 2018 • Neil Zeghidour, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert, Emmanuel Dupoux

In this paper, we study end-to-end systems trained directly from the raw waveform, building on two alternatives for trainable replacements of mel-filterbanks that use a convolutional architecture.

speech-recognition Speech Recognition

Paper
Code

Value Propagation Networks

no code implementations • ICLR 2018 • Nantas Nardelli, Gabriel Synnaeve, Zeming Lin, Pushmeet Kohli, Philip H. S. Torr, Nicolas Usunier

We present Value Propagation (VProp), a set of parameter-efficient differentiable planning modules built on Value Iteration which can successfully be trained using reinforcement learning to solve unseen tasks, have the capability to generalize to larger map sizes, and can learn to navigate in dynamic environments.

Navigate reinforcement-learning +2

Paper
Add Code

Gated ConvNets for Letter-Based ASR

no code implementations • ICLR 2018 • Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model.

Language Modelling speech-recognition +1

Paper
Add Code

Letter-Based Speech Recognition with Gated ConvNets

2 code implementations • 22 Dec 2017 • Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In the recent literature, "end-to-end" speech systems often refer to letter-based acoustic models trained in a sequence-to-sequence manner, either via a recurrent model or via a structured output learning approach (such as CTC).

Ranked #46 on Speech Recognition on LibriSpeech test-clean

Language Modelling speech-recognition +1

Paper
Code

Learning Filterbanks from Raw Speech for Phone Recognition

2 code implementations • 3 Nov 2017 • Neil Zeghidour, Nicolas Usunier, Iasonas Kokkinos, Thomas Schatz, Gabriel Synnaeve, Emmanuel Dupoux

We train a bank of complex filters that operates on the raw waveform and is fed into a convolutional neural network for end-to-end phone recognition.

472

Paper
Code

STARDATA: A StarCraft AI Research Dataset

1 code implementation • 7 Aug 2017 • Zeming Lin, Jonas Gehring, Vasil Khalidov, Gabriel Synnaeve

We provide full game state data along with the original replays that can be viewed in StarCraft.

Imitation Learning Starcraft +1

562

Paper
Code

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

3 code implementations • ICLR 2018 • Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus

When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.

Paper
Code

TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games

2 code implementations • 1 Nov 2016 • Gabriel Synnaeve, Nantas Nardelli, Alex Auvolat, Soumith Chintala, Timothée Lacroix, Zeming Lin, Florian Richoux, Nicolas Usunier

We present TorchCraft, a library that enables deep learning research on Real-Time Strategy (RTS) games such as StarCraft: Brood War, by making it easier to control these games from a machine learning framework, here Torch.

BIG-bench Machine Learning Starcraft

1,376

Paper
Code

Wav2Letter: an End-to-End ConvNet-based Speech Recognition System

9 code implementations • arXiv 2016 • Ronan Collobert, Christian Puhrsch, Gabriel Synnaeve

This paper presents a simple end-to-end model for speech recognition, combining a convolutional-network-based acoustic model and graph decoding.

Speech Recognition

373

Paper
Code

Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

no code implementations • 10 Sep 2016 • Nicolas Usunier, Gabriel Synnaeve, Zeming Lin, Soumith Chintala

We consider scenarios from the real-time strategy game StarCraft as new benchmarks for reinforcement learning algorithms.

Q-Learning reinforcement-learning +2

Paper
Add Code

MazeBase: A Sandbox for Learning from Games

2 code implementations • 23 Nov 2015 • Sainbayar Sukhbaatar, Arthur Szlam, Gabriel Synnaeve, Soumith Chintala, Rob Fergus

This paper introduces MazeBase: an environment for simple 2D games, designed as a sandbox for machine learning approaches to reasoning and planning.

Negation Reinforcement Learning (RL) +1

243

Paper
Code

Prosodic boundary information helps unsupervised word segmentation

no code implementations • HLT 2015 • Gabriel Synnaeve, Emmanuel Dupoux, Bogdan Ludusan

Boundary Detection Language Acquisition +2

Paper
Add Code

Weakly Supervised Multi-Embeddings Learning of Acoustic Models

no code implementations • 20 Dec 2014 • Gabriel Synnaeve, Emmanuel Dupoux

We trained a Siamese network with multi-task same/different information on a speech dataset, and found that it was possible to share a network for both tasks without a loss in performance.

Paper
Add Code

Unsupervised Word Segmentation in Context

no code implementations • COLING 2014 • Gabriel Synnaeve, Isabelle Dautriche, Benjamin Börschinger, Mark Johnson, Emmanuel Dupoux

Language Acquisition Segmentation

Paper
Add Code

A Dataset for StarCraft AI & an Example of Armies Clustering

1 code implementation • 19 Nov 2012 • Gabriel Synnaeve, Pierre Bessiere

We evaluated this clustering method by predicting the outcomes of battles based on the mixture components of army compositions.

Clustering Starcraft

562

Paper
Code
