Search Results for author: Vineel Pratap

Found 14 papers, 10 papers with code

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

1 code implementation • 27 Oct 2023 • Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis

TorchAudio is an open-source audio and speech processing library built for PyTorch.

Self-Supervised Learning • Speech Enhancement • +2

2,379

Scaling Speech Technology to 1,000+ Languages

3 code implementations • arXiv 2023 • Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli

Expanding the language coverage of speech technology has the potential to improve access to information for many more people.

Automatic Speech Recognition • Language Identification • +4

29,176

Flashlight: Enabling Innovation in Tools for Machine Learning

2 code implementations • 29 Jan 2022 • Jacob Kahn, Vineel Pratap, Tatiana Likhomanenko, Qiantong Xu, Awni Hannun, Jeff Cai, Paden Tomasello, Ann Lee, Edouard Grave, Gilad Avidov, Benoit Steiner, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

This is in part due to the difficulties involved in prototyping new computational paradigms with existing frameworks.

BIG-bench Machine Learning

5,142

Star Temporal Classification: Sequence Classification with Partially Labeled Data

1 code implementation • 28 Jan 2022 • Vineel Pratap, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

These experiments show that STC can recover most of the performance of the supervised baseline when up to 70% of the labels are missing.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +3

Word Order Does Not Matter For Speech Recognition

no code implementations • 12 Oct 2021 • Vineel Pratap, Qiantong Xu, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

In this paper, we study the training of an automatic speech recognition system in a weakly supervised setting where the order of words in the transcript labels of the audio training data is not known.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Parallel Composition of Weighted Finite-State Transducers

no code implementations • 6 Oct 2021 • Shubho Sengupta, Vineel Pratap, Awni Hannun

We benchmark our parallel algorithm on the composition of random graphs and the composition of graphs commonly used in speech recognition.

speech-recognition • Speech Recognition

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

3 code implementations • 2 Apr 2021 • Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

On a large-scale competitive setup, we show that pre-training on unlabeled in-domain data reduces the gap between models trained on in-domain and out-of-domain labeled data by 66%-73%.

Self-Supervised Learning

29,174

MLS: A Large-Scale Multilingual Dataset for Speech Research

1 code implementation • 7 Dec 2020 • Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

This paper introduces the Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus suitable for speech research.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

6,328

Rethinking Evaluation in ASR: Are Our Models Robust Enough?

1 code implementation • 22 Oct 2020 • Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Paden Tomasello, Jacob Kahn, Gilad Avidov, Ronan Collobert, Gabriel Synnaeve

Finally, we show that training a single acoustic model on the most widely used datasets, combined, reaches competitive performance on both research and real-world benchmarks.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

6,328

Differentiable Weighted Finite-State Transducers

1 code implementation • 2 Oct 2020 • Awni Hannun, Vineel Pratap, Jacob Kahn, Wei-Ning Hsu

We introduce a framework for automatic differentiation with weighted finite-state transducers (WFSTs) allowing them to be used dynamically at training time.

Handwriting Recognition • speech-recognition • +1

112

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters

no code implementations • 6 Jul 2020 • Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages and simplifying the overall deployment of ASR systems that support diverse languages.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Scaling Up Online Speech Recognition Using ConvNets

no code implementations • 27 Jan 2020 • Vineel Pratap, Qiantong Xu, Jacob Kahn, Gilad Avidov, Tatiana Likhomanenko, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC).

speech-recognition • Speech Recognition

End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

1 code implementation • 19 Nov 2019 • Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Tatiana Likhomanenko, Edouard Grave, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions.

Ranked #16 on Speech Recognition on LibriSpeech test-other (using extra training data)

Language Modelling • speech-recognition • +1

5,333

wav2letter++: The Fastest Open-source Speech Recognition System

8 code implementations • 18 Dec 2018 • Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert

This paper introduces wav2letter++, the fastest open-source deep learning speech recognition framework.

Speech Recognition

6,328
