Search Results for author: Tatiana Likhomanenko

Found 29 papers, 13 papers with code

Federated Learning with Differential Privacy for End-to-End Speech Recognition

no code implementations • 29 Sep 2023 • Martin Pelikan, Sheikh Shams Azam, Vitaly Feldman, Jan "Honza" Silovsky, Kunal Talwar, Tatiana Likhomanenko

($4. 5$, $10^{-9}$)-$\textbf{DP}$) with a 1. 3% (resp.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition

no code implementations • 29 Sep 2023 • Andrew Rouditchenko, Ronan Collobert, Tatiana Likhomanenko

Audio-visual speech contains synchronized audio and visual information that provides cross-modal supervision to learn representations for both automatic speech recognition (ASR) and visual speech recognition (VSR).

Audio-Visual Speech Recognition Automatic Speech Recognition +4

Paper
Add Code

Importance of Smoothness Induced by Optimizers in FL4ASR: Towards Understanding Federated Learning for End-to-End ASR

no code implementations • 22 Sep 2023 • Sheikh Shams Azam, Tatiana Likhomanenko, Martin Pelikan, Jan "Honza" Silovsky

In this paper, we start by training End-to-End Automatic Speech Recognition (ASR) models using Federated Learning (FL) and examining the fundamental considerations that can be pivotal in minimizing the performance gap in terms of word error rate between models trained using FL versus their centralized counterpart.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

VISION Datasets: A Benchmark for Vision-based InduStrial InspectiON

no code implementations • 13 Jun 2023 • Haoping Bai, Shancong Mou, Tatiana Likhomanenko, Ramazan Gokberk Cinbis, Oncel Tuzel, Ping Huang, Jiulong Shan, Jianjun Shi, Meng Cao

We introduce the VISION Datasets, a diverse collection of 14 industrial inspection datasets, uniquely poised to meet these challenges.

Defect Detection Instance Segmentation +1

Paper
Add Code

Unsupervised ASR via Cross-Lingual Pseudo-Labeling

no code implementations • 19 May 2023 • Tatiana Likhomanenko, Loren Lugosch, Ronan Collobert

Here, "unsupervised" means no labeled audio is available for the $\textit{target}$ language.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Stabilizing Transformer Training by Preventing Attention Entropy Collapse

1 code implementation • 11 Mar 2023 • Shuangfei Zhai, Tatiana Likhomanenko, Etai Littwin, Dan Busbridge, Jason Ramapuram, Yizhe Zhang, Jiatao Gu, Josh Susskind

We show that $\sigma$Reparam provides stability and robustness with respect to the choice of hyperparameters, going so far as enabling training (a) a Vision Transformer {to competitive performance} without warmup, weight decay, layer normalization or adaptive optimizers; (b) deep architectures in machine translation and (c) speech recognition to competitive performance without warmup and adaptive optimizers.

Automatic Speech Recognition Image Classification +6

268

Paper
Code

Joint Speech Transcription and Translation: Pseudo-Labeling with Out-of-Distribution Data

no code implementations • 20 Dec 2022 • Mozhdeh Gheini, Tatiana Likhomanenko, Matthias Sperber, Hendra Setiawan

Self-training has been shown to be helpful in addressing data scarcity for many domains, including vision, speech, and language.

Data Augmentation Pseudo Label +2

Paper
Add Code

Continuous Soft Pseudo-Labeling in ASR

no code implementations • 11 Nov 2022 • Tatiana Likhomanenko, Ronan Collobert, Navdeep Jaitly, Samy Bengio

Continuous pseudo-labeling (PL) algorithms such as slimIPL have recently emerged as a powerful strategy for semi-supervised learning in speech recognition.

speech-recognition Speech Recognition

Paper
Add Code

More Speaking or More Speakers?

no code implementations • 2 Nov 2022 • Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko

We perform a systematic analysis on both labeled and unlabeled data by varying the number of speakers while keeping the number of hours fixed and vice versa.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Continuous Pseudo-Labeling from the Start

no code implementations • 17 Oct 2022 • Dan Berrebbi, Ronan Collobert, Samy Bengio, Navdeep Jaitly, Tatiana Likhomanenko

Nevertheless, these approaches still rely on bootstrapping the ST using an initial supervised learning phase where the model is trained on labeled data alone.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Position Prediction as an Effective Pretraining Strategy

1 code implementation • 15 Jul 2022 • Shuangfei Zhai, Navdeep Jaitly, Jason Ramapuram, Dan Busbridge, Tatiana Likhomanenko, Joseph Yitan Cheng, Walter Talbott, Chen Huang, Hanlin Goh, Joshua Susskind

This pretraining strategy which has been used in BERT models in NLP, Wav2Vec models in Speech and, recently, in MAE models in Vision, forces the model to learn about relationships between the content in different parts of the input using autoencoding related objectives.

Position speech-recognition +1

Paper
Code

Flashlight: Enabling Innovation in Tools for Machine Learning

2 code implementations • 29 Jan 2022 • Jacob Kahn, Vineel Pratap, Tatiana Likhomanenko, Qiantong Xu, Awni Hannun, Jeff Cai, Paden Tomasello, Ann Lee, Edouard Grave, Gilad Avidov, Benoit Steiner, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

This is in part due to the difficulties involved in prototyping new computational paradigms with existing frameworks.

BIG-bench Machine Learning

5,145

Paper
Code

Pseudo-Labeling for Massively Multilingual Speech Recognition

no code implementations • 30 Oct 2021 • Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems.

speech-recognition Speech Recognition

Paper
Add Code

Word Order Does Not Matter For Speech Recognition

no code implementations • 12 Oct 2021 • Vineel Pratap, Qiantong Xu, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

In this paper, we study training of automatic speech recognition system in a weakly supervised setting where the order of words in transcript labels of the audio training data is not known.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition

no code implementations • 14 Jun 2021 • Vimal Manohar, Tatiana Likhomanenko, Qiantong Xu, Wei-Ning Hsu, Ronan Collobert, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed

In this paper, we introduce the Kaizen framework that uses a continuously improving teacher to generate pseudo-labels for semi-supervised speech recognition (ASR).

speech-recognition Speech Recognition

Paper
Add Code

CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings

1 code implementation • NeurIPS 2021 • Tatiana Likhomanenko, Qiantong Xu, Gabriel Synnaeve, Ronan Collobert, Alex Rogozhnikov

Absolute or relative positional embeddings are the most popular ways to feed Transformer models with positional information.

Machine Translation speech-recognition +2

Paper
Code

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

3 code implementations • 2 Apr 2021 • Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

On a large-scale competitive setup, we show that pre-training on unlabeled in-domain data reduces the gap between models trained on in-domain and out-of-domain labeled data by 66%-73%.

Self-Supervised Learning

29,201

Paper
Code

Joint Masked CPC and CTC Training for ASR

1 code implementation • 30 Oct 2020 • Chaitanya Talnikar, Tatiana Likhomanenko, Ronan Collobert, Gabriel Synnaeve

Self-supervised learning (SSL) has shown promise in learning representations of audio that are useful for automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

491

Paper
Code

SlimIPL: Language-Model-Free Iterative Pseudo-Labeling

no code implementations • 22 Oct 2020 • Tatiana Likhomanenko, Qiantong Xu, Jacob Kahn, Gabriel Synnaeve, Ronan Collobert

We improve upon the IPL algorithm: as the model learns, we propose to iteratively re-generate transcriptions with hard labels (the most probable tokens), that is, without a language model.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Rethinking Evaluation in ASR: Are Our Models Robust Enough?

1 code implementation • 22 Oct 2020 • Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Paden Tomasello, Jacob Kahn, Gilad Avidov, Ronan Collobert, Gabriel Synnaeve

Finally, we show that training a single acoustic model on the most widely-used datasets - combined - reaches competitive performance on both research and real-world benchmarks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

6,331

Paper
Code

Self-training and Pre-training are Complementary for Speech Recognition

3 code implementations • 22 Oct 2020 • Qiantong Xu, Alexei Baevski, Tatiana Likhomanenko, Paden Tomasello, Alexis Conneau, Ronan Collobert, Gabriel Synnaeve, Michael Auli

Self-training and unsupervised pre-training have emerged as effective approaches to improve speech recognition systems using unlabeled data.

Ranked #1 on Speech Recognition on LibriSpeech train-clean-100 test-other (using extra training data)

speech-recognition Speech Recognition +1

29,203

Paper
Code

Iterative Pseudo-Labeling for Speech Recognition

1 code implementation • 19 May 2020 • Qiantong Xu, Tatiana Likhomanenko, Jacob Kahn, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

In particular, IPL fine-tunes an existing model at each iteration using both labeled data and a subset of unlabeled data.

Ranked #11 on Speech Recognition on LibriSpeech test-other

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

6,331

Paper
Code

Scaling Up Online Speech Recognition Using ConvNets

no code implementations • 27 Jan 2020 • Vineel Pratap, Qiantong Xu, Jacob Kahn, Gilad Avidov, Tatiana Likhomanenko, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC).

speech-recognition Speech Recognition

Paper
Add Code

Libri-Light: A Benchmark for ASR with Limited or No Supervision

2 code implementations • 17 Dec 2019 • Jacob Kahn, Morgane Rivière, Weiyi Zheng, Evgeny Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdel-rahman Mohamed, Emmanuel Dupoux

Additionally, we provide baseline systems and evaluation metrics working under three settings: (1) the zero resource/unsupervised setting (ABX), (2) the semi-supervised setting (PER, CER) and (3) the distant supervision setting (WER).

Ranked #1 on Speech Recognition on Libri-Light test-other (ABX-within metric)

speech-recognition Speech Recognition

446

Paper
Code

End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

1 code implementation • 19 Nov 2019 • Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Tatiana Likhomanenko, Edouard Grave, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

We study pseudo-labeling for the semi-supervised training of ResNet, Time-Depth Separable ConvNets, and Transformers for speech recognition, with either CTC or Seq2Seq loss functions.

Ranked #16 on Speech Recognition on LibriSpeech test-other (using extra training data)

Language Modelling speech-recognition +1

5,333

Paper
Code

Who Needs Words? Lexicon-Free Speech Recognition

no code implementations • 9 Apr 2019 • Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert

Lexicon-free speech recognition naturally deals with the problem of out-of-vocabulary (OOV) words.

speech-recognition Speech Recognition

Paper
Add Code

InfiniteBoost: building infinite ensembles with gradient descent

1 code implementation • 4 Jun 2017 • Alex Rogozhnikov, Tatiana Likhomanenko

In machine learning ensemble methods have demonstrated high accuracy for the variety of problems in different areas.

BIG-bench Machine Learning General Classification +1

183

Paper
Code

Inclusive Flavour Tagging Algorithm

no code implementations • 24 May 2017 • Tatiana Likhomanenko, Denis Derkach, Alex Rogozhnikov

The proposed inclusive flavour-tagging algorithm is applicable to tag the flavour of $B$ mesons in any proton-proton experiment.

TAG

Paper
Add Code

Reproducible Experiment Platform

1 code implementation • 1 Oct 2015 • Tatiana Likhomanenko, Alex Rogozhnikov, Alexander Baranov, Egor Khairullin, Andrey Ustyuzhanin

Data analysis in fundamental sciences nowadays is an essential process that pushes frontiers of our knowledge and leads to new discoveries.

Data Analysis, Statistics and Probability

679

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.