Search Results for author: Somshubra Majumdar

Found 17 papers, 8 papers with code

NVIDIA NeMo Offline Speech Translation Systems for IWSLT 2022

no code implementations • IWSLT (ACL) 2022 • Oleksii Hrinchuk, Vahid Noroozi, Ashwinkumar Ganesan, Sarah Campbell, Sandeep Subramanian, Somshubra Majumdar, Oleksii Kuchaiev

Our cascade system consists of 1) a Conformer RNN-T automatic speech recognition model, 2) a punctuation-capitalization model based on a pre-trained T5 encoder, and 3) an ensemble of Transformer neural machine translation models fine-tuned on TED talks.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +4
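
The cascade in this entry is a straightforward three-stage pipeline. The sketch below shows how the stages compose; the stage functions are hypothetical placeholders standing in for the Conformer RNN-T ASR model, the T5-based punctuation-capitalization model, and the NMT ensemble, not the NeMo implementation.

```python
# Hypothetical sketch of the three-stage cascade: ASR -> punctuation/
# capitalization -> NMT ensemble. All stage functions are placeholders.

def transcribe(audio):
    """Stand-in for the Conformer RNN-T ASR model: lowercase, unpunctuated text."""
    return "this is a placeholder transcript"

def punctuate_and_capitalize(text):
    """Stand-in for the T5-encoder punctuation-capitalization model."""
    return text.capitalize() + "."

def translate_ensemble(text, n_models=3):
    """Stand-in for the ensemble of Transformer NMT models."""
    return f"[translation averaged over {n_models} models] {text}"

def cascade_speech_translation(audio):
    transcript = transcribe(audio)
    formatted = punctuate_and_capitalize(transcript)
    return translate_ensemble(formatted)

print(cascade_speech_translation(audio=None))
```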

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

1 code implementation • 27 Dec 2023 • Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg

We also showed that training a model with multiple latencies can achieve better accuracy than single-latency models, while enabling us to support multiple latencies with a single model.

Automatic Speech Recognition • speech-recognition • +1
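
A minimal sketch of the cache-based streaming idea described in this entry is given below; the model methods (`init_cache`, `infer_chunk`) and the chunk sizes are assumptions for illustration, not the NeMo API.

```python
# Hypothetical sketch (not the NeMo API) of cache-based streaming inference:
# audio is consumed chunk by chunk and an encoder cache of left-context states
# is carried across calls, so each step only processes the new frames.

def stream_decode(model, frames, chunk_size=160):
    cache = model.init_cache()            # placeholder: empty left-context states
    hypothesis = []
    for start in range(0, len(frames), chunk_size):
        chunk = frames[start:start + chunk_size]
        tokens, cache = model.infer_chunk(chunk, cache)   # placeholder method
        hypothesis.extend(tokens)
    return hypothesis

# Multi-latency training idea from the abstract: sample a different chunk size
# (i.e. latency) for each batch so one model learns to operate at all of them.
import random

LATENCY_CHUNK_SIZES = [16, 64, 128]       # illustrative values, not the paper's

def sample_chunk_size():
    return random.choice(LATENCY_CHUNK_SIZES)
```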

CTC Variations Through New WFST Topologies

no code implementations • 6 Oct 2021 • Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg

This paper presents novel Weighted Finite-State Transducer (WFST) topologies to implement Connectionist Temporal Classification (CTC)-like algorithms for automatic speech recognition.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1
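
For context, the snippet below shows the standard CTC criterion (here via PyTorch's `nn.CTCLoss`) that the paper's WFST topologies generalize; it does not reproduce the WFST construction itself, and the shapes are arbitrary example values.

```python
import torch
import torch.nn as nn

# Baseline CTC objective. The paper expresses CTC-like criteria as weighted
# finite-state transducer topologies; this snippet only shows the standard loss.
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

T, N, C = 50, 4, 20                                    # time steps, batch, vocab (incl. blank)
log_probs = torch.randn(T, N, C).log_softmax(dim=-1)   # model outputs, shape (T, N, C)
targets = torch.randint(1, C, (N, 10), dtype=torch.long)  # label ids exclude blank id 0
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```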

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

1 code implementation • 5 Apr 2021 • Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

In the English speech-to-text (STT) machine learning task, acoustic models are conventionally trained on uncased Latin characters, and any necessary orthography (such as capitalization, punctuation, and denormalization of non-standard words) is imputed by separate post-processing models.

speech-recognition • Speech Recognition
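
The toy sketch below contrasts the two target formats described in the abstract: a conventionally normalized ASR target versus the fully formatted transcript that SPGISpeech uses as the end-to-end training target. The normalization rules are illustrative only; a real pipeline would also verbalize numbers and other non-standard words.

```python
import re

# Fully formatted transcript (the SPGISpeech-style end-to-end target).
formatted = "Revenue grew 12% in Q3, reaching $4.5 billion."

def conventional_target(text):
    """Toy stand-in for conventional ASR text normalization: uncased,
    punctuation and symbols stripped (number verbalization omitted)."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

print(conventional_target(formatted))  # "revenue grew 12 in q3 reaching 4 5 billion"
print(formatted)                       # fully formatted end-to-end target
```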

Adversarial Attacks on Time Series

2 code implementations • 27 Feb 2019 • Fazle Karim, Somshubra Majumdar, Houshang Darabi

In this paper, we propose utilizing an adversarial transformation network (ATN) on a distilled model to attack various time series classification models.

Classification • Dynamic Time Warping • +4
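
The sketch below is a generic gradient-based (FGSM-style) perturbation of a toy 1D-CNN time series classifier; it is not the adversarial transformation network (ATN) proposed in the paper, only an illustration of what crafting an adversarial time series looks like.

```python
import torch
import torch.nn as nn

# Toy 1D-CNN classifier standing in for a time series classification model.
class ToyClassifier(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(16, n_classes),
        )

    def forward(self, x):
        return self.net(x)

model = ToyClassifier().eval()
series = torch.randn(1, 1, 128, requires_grad=True)   # (batch, channel, length)
label = torch.tensor([2])

# FGSM-style step: perturb the series along the sign of the loss gradient.
loss = nn.functional.cross_entropy(model(series), label)
loss.backward()
epsilon = 0.1
adversarial_series = series + epsilon * series.grad.sign()
```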

Insights into LSTM Fully Convolutional Networks for Time Series Classification

4 code implementations • 27 Feb 2019 • Fazle Karim, Somshubra Majumdar, Houshang Darabi

In this paper, we perform a series of ablation tests (3627 experiments) on LSTM-FCN and ALSTM-FCN to provide a better understanding of the model and each of its sub-modules.

Classification • General Classification • +3

LSTM Fully Convolutional Networks for Time Series Classification

9 code implementations • 8 Sep 2017 • Fazle Karim, Somshubra Majumdar, Houshang Darabi, Shun Chen

We propose the augmentation of fully convolutional networks with long short-term memory recurrent neural network (LSTM RNN) sub-modules for time series classification.

General Classification • Outlier Detection • +3
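
A simplified PyTorch sketch of the LSTM-FCN idea follows: an FCN branch and an LSTM branch process the same series, and their features are concatenated before the classifier. Filter counts and kernel sizes follow the paper's general recipe, but details such as the dimension shuffle and dropout are approximations, not the reference implementation.

```python
import torch
import torch.nn as nn

class LSTMFCN(nn.Module):
    """Approximate LSTM-FCN: FCN branch + LSTM branch, features concatenated."""

    def __init__(self, n_classes, lstm_hidden=8):
        super().__init__()
        self.fcn = nn.Sequential(
            nn.Conv1d(1, 128, kernel_size=8, padding=4), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 256, kernel_size=5, padding=2), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, 128, kernel_size=3, padding=1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                      # global average pooling
        )
        self.lstm = nn.LSTM(input_size=1, hidden_size=lstm_hidden, batch_first=True)
        self.classifier = nn.Linear(128 + lstm_hidden, n_classes)

    def forward(self, x):                                 # x: (batch, 1, length)
        fcn_feat = self.fcn(x).squeeze(-1)                # (batch, 128)
        _, (h, _) = self.lstm(x.transpose(1, 2))          # LSTM over (batch, length, 1)
        return self.classifier(torch.cat([fcn_feat, h[-1]], dim=1))

model = LSTMFCN(n_classes=5)
logits = model(torch.randn(2, 1, 128))                    # two example series of length 128
```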
