Search Results for author: Vitaliy Liptchinsky

Found 13 papers, 7 papers with code

Flashlight: Enabling Innovation in Tools for Machine Learning

2 code implementations • 29 Jan 2022 • Jacob Kahn, Vineel Pratap, Tatiana Likhomanenko, Qiantong Xu, Awni Hannun, Jeff Cai, Paden Tomasello, Ann Lee, Edouard Grave, Gilad Avidov, Benoit Steiner, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

This is in part due to the difficulties involved in prototyping new computational paradigms with existing frameworks.

BIG-bench Machine Learning

Self-supervised Pretraining of Visual Features in the Wild

1 code implementation • 2 Mar 2021 • Priya Goyal, Mathilde Caron, Benjamin Lefaudeux, Min Xu, Pengchao Wang, Vivek Pai, Mannat Singh, Vitaliy Liptchinsky, Ishan Misra, Armand Joulin, Piotr Bojanowski

Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods.

Ranked #6 on Image Classification on Places205

Self-Supervised Image Classification, Self-Supervised Learning

Beyond English-Centric Multilingual Machine Translation

7 code implementations • 21 Oct 2020 • Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages.

Machine Translation, Translation

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

no code implementations • 6 Jul 2020 • Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages, and overall simplifying deployment of ASR systems that support diverse languages.

Automatic Speech Recognition (ASR)

Large scale weakly and semi-supervised learning for low-resource video ASR

no code implementations • 16 May 2020 • Kritika Singh, Vimal Manohar, Alex Xiao, Sergey Edunov, Ross Girshick, Vitaliy Liptchinsky, Christian Fuegen, Yatharth Saraf, Geoffrey Zweig, Abdel-rahman Mohamed

Many semi- and weakly-supervised approaches have been investigated for overcoming the labeling cost of building high-quality speech recognition systems.

Speech Recognition

Scaling Up Online Speech Recognition Using ConvNets

no code implementations • 27 Jan 2020 • Vineel Pratap, Qiantong Xu, Jacob Kahn, Gilad Avidov, Tatiana Likhomanenko, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC).

Speech Recognition

Libri-Light: A Benchmark for ASR with Limited or No Supervision

2 code implementations • 17 Dec 2019 • Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux

Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).

Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-within metric)

Speech Recognition

End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

1 code implementation • 19 Nov 2019 • Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Tatiana Likhomanenko, Edouard Grave, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions.

Ranked #16 on Speech Recognition on LibriSpeech test-other (using extra training data)

Language Modelling, Speech Recognition

wav2letter++: The Fastest Open-source Speech Recognition System

8 code implementations • 18 Dec 2018 • Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert

This paper introduces wav2letter++, the fastest open-source deep learning speech recognition framework.

Speech Recognition

Fully Convolutional Speech Recognition

no code implementations • 17 Dec 2018 • Neil Zeghidour, Qiantong Xu, Vitaliy Liptchinsky, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert

In this paper we present an alternative approach based solely on convolutional neural networks, leveraging recent advances in acoustic models from the raw waveform and language modeling.

Ranked #3 on Speech Recognition on WSJ eval93

Language Modelling, Speech Recognition

To Reverse the Gradient or Not: An Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition

no code implementations • 9 Dec 2018 • Yossi Adi, Neil Zeghidour, Ronan Collobert, Nicolas Usunier, Vitaliy Liptchinsky, Gabriel Synnaeve

In multi-task learning, the goal is speaker prediction; we expect a performance improvement with this joint training if the two tasks of speech recognition and speaker recognition share a common set of underlying features.

Multi-Task Learning, Speaker Recognition

Gated ConvNets for Letter-Based ASR

no code implementations • ICLR 2018 • Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In this paper we introduce a new speech recognition system, leveraging a simple letter-based ConvNet acoustic model.

Language Modelling, Speech Recognition

Letter-Based Speech Recognition with Gated ConvNets

2 code implementations • 22 Dec 2017 • Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

In the recent literature, "end-to-end" speech systems often refer to letter-based acoustic models trained in a sequence-to-sequence manner, either via a recurrent model or via a structured output learning approach (such as CTC).

Ranked #46 on Speech Recognition on LibriSpeech test-clean

Language Modelling, Speech Recognition
