Search Results for author: Oleksii Kuchaiev

Found 16 papers, 9 papers with code

NVIDIA NeMo Offline Speech Translation Systems for IWSLT 2022

no code implementations · IWSLT (ACL) 2022 · Oleksii Hrinchuk, Vahid Noroozi, Ashwinkumar Ganesan, Sarah Campbell, Sandeep Subramanian, Somshubra Majumdar, Oleksii Kuchaiev

Our cascade system consists of 1) a Conformer RNN-T automatic speech recognition model, 2) a punctuation-and-capitalization model based on a pre-trained T5 encoder, and 3) an ensemble of Transformer neural machine translation models fine-tuned on TED talks.

Automatic Speech Recognition (ASR) +4
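The three-stage cascade described in the abstract can be sketched as a simple function composition. The stage functions below are stand-ins (my assumptions, not NeMo APIs): a real system would invoke a Conformer RNN-T ASR model, the T5-based punctuation-capitalization model, and an ensemble of Transformer NMT models.

```python
# Toy sketch of the cascade: ASR -> punctuation/capitalization -> NMT.
# Every stage here is a hypothetical stub standing in for a real model.

def asr(audio):
    # Stand-in for Conformer RNN-T: emits lowercase, unpunctuated text.
    return "hello and welcome to the talk"

def punctuate_capitalize(text):
    # Stand-in for the T5-encoder punctuation-capitalization model.
    return text.capitalize() + "."

def translate_ensemble(text, n_models=3):
    # Stand-in for an NMT ensemble: members "vote" on a translation;
    # here they trivially agree, so majority voting returns that string.
    candidates = ["[de] " + text for _ in range(n_models)]
    return max(set(candidates), key=candidates.count)

def cascade(audio):
    return translate_ensemble(punctuate_capitalize(asr(audio)))

print(cascade(audio=None))  # [de] Hello and welcome to the talk.
```

The point of the sketch is only the composition order: orthography is restored between recognition and translation, so the NMT models see punctuated, cased input closer to their training distribution.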

NVIDIA NeMo Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21

no code implementations · 16 Nov 2021 · Sandeep Subramanian, Oleksii Hrinchuk, Virginia Adams, Oleksii Kuchaiev

This paper provides an overview of NVIDIA NeMo's neural machine translation systems for the constrained data track of the WMT21 News and Biomedical Shared Translation Tasks.

Data Augmentation Knowledge Distillation +3

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

1 code implementation · 5 Apr 2021 · Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

In the English speech-to-text (STT) machine learning task, acoustic models are conventionally trained on uncased Latin characters, and any necessary orthography (such as capitalization, punctuation, and denormalization of non-standard words) is imputed by separate post-processing models.

Speech Recognition
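The post-processing step the abstract describes — imputing capitalization, punctuation, and denormalization of non-standard words on top of uncased ASR output — can be illustrated with a toy rule-based pass. This is purely an illustration under my own assumptions (a tiny hand-written lexicon and two trivial rules), not the SPGISpeech pipeline, which uses learned models for these steps.

```python
# Toy orthography post-processor: denormalize spelled-out numbers,
# restore capitalization, and add terminal punctuation.
# NUMBER_WORDS is a hypothetical toy lexicon, not a real resource.

NUMBER_WORDS = {"five thousand": "5,000", "twenty one": "21"}

def postprocess(uncased: str) -> str:
    text = uncased
    for spoken, written in NUMBER_WORDS.items():
        text = text.replace(spoken, written)   # denormalization
    text = text[0].upper() + text[1:]          # capitalization
    if not text.endswith("."):
        text += "."                            # punctuation
    return text

print(postprocess("revenue grew to five thousand dollars"))
# Revenue grew to 5,000 dollars.
```

The contrast the paper draws is that a fully formatted end-to-end model emits the right-hand side directly, instead of composing the acoustic model with separate post-processors like this one.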

Jasper: An End-to-End Convolutional Neural Acoustic Model

8 code implementations · 5 Apr 2019 · Jason Li, Vitaly Lavrukhin, Boris Ginsburg, Ryan Leary, Oleksii Kuchaiev, Jonathan M. Cohen, Huyen Nguyen, Ravi Teja Gadde

In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech recognition models without any external training data.

Language Modelling Speech Recognition
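Jasper is built from stacked 1D-convolutional blocks over the time axis with residual connections. The numpy sketch below is my own minimal reading of that building block, not the released code: it keeps only the convolution, the residual add, and the ReLU, and omits the batch norm and dropout that the real blocks also use.

```python
import numpy as np

def conv1d(x, w):
    """Naive 'same'-padded 1D convolution.
    x: (channels, time); w: (out_ch, in_ch, k) with odd k."""
    out_ch, in_ch, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    t = x.shape[1]
    y = np.zeros((out_ch, t))
    for o in range(out_ch):
        for j in range(t):
            y[o, j] = np.sum(w[o] * xp[:, j:j + k])
    return y

def jasper_block(x, w):
    # Conv over time, residual connection, then ReLU. The residual
    # assumes matching channel counts; the paper handles channel
    # changes with pointwise convolutions on the skip path.
    return np.maximum(conv1d(x, w) + x, 0.0)

x = np.random.randn(4, 16)          # 4 channels, 16 time steps
w = np.random.randn(4, 4, 3) * 0.1  # same in/out channels, kernel 3
print(jasper_block(x, w).shape)     # (4, 16)
```

Stacking many such blocks gives the deep, purely convolutional acoustic model the title refers to — no recurrence, so training parallelizes over the time axis.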

Training Deep AutoEncoders for Recommender Systems

no code implementations · ICLR 2018 · Oleksii Kuchaiev, Boris Ginsburg

Our model is based on a deep autoencoder with 6 layers and is trained end-to-end without any layer-wise pre-training.

Recommendation Systems

Training Deep AutoEncoders for Collaborative Filtering

9 code implementations · 5 Aug 2017 · Oleksii Kuchaiev, Boris Ginsburg

Our model is based on a deep autoencoder with 6 layers and is trained end-to-end without any layer-wise pre-training.

Collaborative Filtering Recommendation Systems
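A 6-layer autoencoder for collaborative filtering can be sketched in a few lines of numpy. This is an assumption-laden toy, not the released implementation: the layer widths are invented, ReLU stands in for the paper's activation choice, and the masked MSE reflects the common collaborative-filtering convention that unobserved ratings (zeros) are excluded from the loss.

```python
import numpy as np

rng = np.random.default_rng(0)
# 6 weight layers: encoder 100->64->32->16, decoder 16->32->64->100.
# These widths are illustrative, not the paper's.
dims = [100, 64, 32, 16, 32, 64, 100]
weights = [rng.standard_normal((i, o)) * 0.05
           for i, o in zip(dims[:-1], dims[1:])]

def forward(x):
    h = x
    for w in weights[:-1]:
        h = np.maximum(h @ w, 0.0)   # ReLU hidden layers
    return h @ weights[-1]           # linear output: predicted ratings

def masked_mse(pred, target):
    mask = target != 0               # only observed ratings count
    return float(np.mean((pred[mask] - target[mask]) ** 2))

ratings = np.zeros(100)              # one user's sparse rating vector
idx = rng.choice(100, 10, replace=False)
ratings[idx] = rng.integers(1, 6, 10)
print(masked_mse(forward(ratings), ratings))
```

Training end-to-end, as the snippet above the tags notes, means backpropagating this masked loss through all six layers at once rather than pre-training them greedily layer by layer.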

Factorization tricks for LSTM networks

2 code implementations · 31 Mar 2017 · Oleksii Kuchaiev, Boris Ginsburg

We present two simple ways of reducing the number of parameters and accelerating the training of large Long Short-Term Memory (LSTM) networks: the first is "matrix factorization by design" of the LSTM matrix into the product of two smaller matrices, and the second is partitioning of the LSTM matrix, its inputs, and its states into independent groups.

Language Modelling
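The parameter savings from the two tricks can be counted directly. The arithmetic below is my own reading of the abstract under the standard LSTM parameterization (one stacked matrix of shape 4n × (n + p) for cell size n and input size p); the paper defines the exact shapes, and the savings depend on the chosen rank r and group count k.

```python
# Parameter counts for a baseline LSTM matrix versus the two tricks:
# F-LSTM: replace W (4n x (n+p)) with W2 @ W1, W1: r x (n+p), W2: 4n x r.
# G-LSTM: split the matrix, inputs, and states into k independent groups,
#         each group holding a 4(n/k) x ((n+p)/k) matrix.
n, p, r, k = 1024, 1024, 128, 4   # cell size, input size, rank, groups

full   = 4 * n * (n + p)
f_lstm = r * (n + p) + 4 * n * r
g_lstm = k * (4 * (n // k) * ((n + p) // k))

print(full, f_lstm, g_lstm)        # 8388608 786432 2097152
assert f_lstm < full and g_lstm < full
```

With these (illustrative) settings, the factorized matrix uses about 9% of the baseline parameters and the grouped variant about 25%, which is where the training speedup comes from.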
