Search Results for author: Vahid Noroozi

Found 15 papers, 4 papers with code

NVIDIA NeMo Offline Speech Translation Systems for IWSLT 2022

no code implementations • IWSLT (ACL) 2022 • Oleksii Hrinchuk, Vahid Noroozi, Ashwinkumar Ganesan, Sarah Campbell, Sandeep Subramanian, Somshubra Majumdar, Oleksii Kuchaiev

Our cascade system consists of 1) a Conformer RNN-T automatic speech recognition model, 2) a punctuation-capitalization model based on a pre-trained T5 encoder, and 3) an ensemble of Transformer neural machine translation models fine-tuned on TED talks.

Automatic Speech Recognition (ASR) +4
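Below is a minimal sketch of such a cascade pipeline, wiring the three stages together; the component interfaces are placeholders of my own, not the actual NeMo classes used in the submission.

```python
# Hypothetical cascade: ASR -> punctuation/capitalization -> NMT ensemble.
# All component names here are illustrative placeholders, not NeMo APIs.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class CascadeST:
    asr: Callable[[str], str]                 # audio path -> lowercase transcript
    punct_cap: Callable[[str], str]           # transcript -> punctuated, cased text
    translators: List[Callable[[str], str]]   # ensemble of NMT models

    def translate(self, audio_path: str) -> str:
        transcript = self.asr(audio_path)        # 1) Conformer RNN-T ASR
        formatted = self.punct_cap(transcript)   # 2) T5-based punctuation/capitalization
        hypotheses = [nmt(formatted) for nmt in self.translators]
        # 3) crude stand-in for ensembling: pick the most frequent hypothesis
        return max(set(hypotheses), key=hypotheses.count)
```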

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

1 code implementation • 27 Dec 2023 • Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg

We also showed that training a model with multiple latencies can achieve better accuracy than single-latency models, while also allowing multiple latencies to be supported with a single model.

Automatic Speech Recognition +1
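The multi-latency training idea can be illustrated with a small sketch: sample a different chunk size (latency) per training batch and build the corresponding chunked attention mask. The latency values and frame shift below are assumptions for illustration, not the paper's settings.

```python
# Sketch of multi-latency training: one model sees several chunk sizes during
# training so it can later stream at any of them. Values are hypothetical.
import random
import torch

LATENCIES_MS = [80, 480, 1040]   # assumed supported latencies
FRAME_MS = 40                    # assumed encoder frame shift

def chunked_attention_mask(num_frames: int, chunk_frames: int) -> torch.Tensor:
    """Allow each frame to attend only to its own chunk and to past chunks."""
    idx = torch.arange(num_frames)
    chunk_id = idx // chunk_frames
    # entry [i, j] is True if key j's chunk is not in query i's future
    return chunk_id.unsqueeze(1) >= chunk_id.unsqueeze(0)

def sample_training_mask(num_frames: int) -> torch.Tensor:
    latency = random.choice(LATENCIES_MS)       # pick a latency for this batch
    return chunked_attention_mask(num_frames, max(1, latency // FRAME_MS))
```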

SGD-QA: Fast Schema-Guided Dialogue State Tracking for Unseen Services

no code implementations • 17 May 2021 • Yang Zhang, Vahid Noroozi, Evelina Bakhturina, Boris Ginsburg

In this paper, we propose SGD-QA, a simple and extensible model for schema-guided dialogue state tracking based on a question answering approach.

Dialogue State Tracking • Goal-Oriented Dialogue Systems +1
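As a rough sketch of the question-answering framing (the question template and model interface are hypothetical, not SGD-QA itself), each slot in a service's schema can be turned into a question answered against the dialogue history:

```python
# Hypothetical QA-style dialogue state tracking over a service schema.
from typing import Callable, Dict

def track_state(dialogue_history: str,
                schema: Dict[str, str],
                qa_model: Callable[[str, str], str]) -> Dict[str, str]:
    """schema maps slot name -> slot description taken from the service schema."""
    state = {}
    for slot, description in schema.items():
        question = f"What is the value of '{slot}' ({description})?"
        answer = qa_model(question, dialogue_history)
        if answer:                       # keep only the slots the model filled
            state[slot] = answer
    return state
```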

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

1 code implementation • 5 Apr 2021 • Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

In the English speech-to-text (STT) machine learning task, acoustic models are conventionally trained on uncased Latin characters, and any necessary orthography (such as capitalization, punctuation, and denormalization of non-standard words) is imputed by separate post-processing models.

Speech Recognition
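The contrast the abstract draws can be made concrete with a toy normalization step showing what conventional uncased, unpunctuated targets discard, and hence what separate post-processing models would otherwise have to restore (illustrative only, not the SPGISpeech tooling):

```python
# Collapse a fully formatted transcript into conventional ASR training targets.
import re

def normalize(transcript: str) -> str:
    text = transcript.lower()
    text = re.sub(r"[^a-z' ]", " ", text)     # drop casing, punctuation, digits
    return re.sub(r"\s+", " ", text).strip()

formatted = "Revenue grew 12% year-over-year, to $4.2 billion."
print(normalize(formatted))
# -> "revenue grew year over year to billion"
# (a real pipeline would verbalize the numbers rather than drop them)
```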

I-ODA, Real-World Multi-modal Longitudinal Data for Ophthalmic Applications

no code implementations • 30 Mar 2021 • Nooshin Mojab, Vahid Noroozi, Abdullah Aleem, Manoj P. Nallabothula, Joseph Baker, Dimitri T. Azar, Mark Rosenblatt, RV Paul Chan, Darvin Yi, Philip S. Yu, Joelle A. Hallak

In this paper, we present a new multi-modal longitudinal ophthalmic imaging dataset, the Illinois Ophthalmic Database Atlas (I-ODA), with the goal of advancing state-of-the-art computer vision applications in ophthalmology and improving the translatable capacity of AI-based applications across different clinical settings.

Semi-supervised Deep Representation Learning for Multi-View Problems

no code implementations • 11 Nov 2018 • Vahid Noroozi, Sara Bahaadini, Lei Zheng, Sihong Xie, Weixiang Shao, Philip S. Yu

While neural networks for learning representations of multi-view data have previously been proposed as state-of-the-art multi-view dimension reduction techniques, how to make the representation discriminative with only a small amount of labeled data is not well studied.

Dimensionality Reduction • Learning Representation Of Multi-View Data
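A rough sketch of that general recipe, under my own assumptions rather than the paper's exact architecture: per-view encoders map into a shared representation, a reconstruction loss uses all (unlabeled) data, and a classification loss on the few labeled examples makes the representation discriminative.

```python
# Semi-supervised multi-view representation learning, heavily simplified.
import torch
import torch.nn as nn

class MultiViewSemiSup(nn.Module):
    def __init__(self, view_dims, shared_dim, num_classes):
        super().__init__()
        self.encoders = nn.ModuleList(nn.Linear(d, shared_dim) for d in view_dims)
        self.decoders = nn.ModuleList(nn.Linear(shared_dim, d) for d in view_dims)
        self.classifier = nn.Linear(shared_dim, num_classes)

    def forward(self, views):
        zs = [enc(v) for enc, v in zip(self.encoders, views)]
        z = torch.stack(zs).mean(dim=0)           # fuse views by averaging
        recons = [dec(z) for dec in self.decoders]
        return z, recons, self.classifier(z)

def loss(views, recons, logits, labels=None):
    rec = sum(nn.functional.mse_loss(r, v) for r, v in zip(recons, views))
    if labels is None:                            # unlabeled batch: reconstruction only
        return rec
    return rec + nn.functional.cross_entropy(logits, labels)
```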

DIRECT: Deep Discriminative Embedding for Clustering of LIGO Data

no code implementations • 7 May 2018 • Sara Bahaadini, Vahid Noroozi, Neda Rohani, Scott Coughlin, Michael Zevin, Aggelos K. Katsaggelos

In this paper, benefiting from the strong ability of deep neural networks to estimate non-linear functions, we propose a discriminative embedding function to be used as a feature extractor for clustering tasks.

Clustering
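A minimal sketch of the general pattern: a neural embedding acts as a feature extractor whose output is clustered by an off-the-shelf algorithm. The architecture is a placeholder and the discriminative training step is omitted, so this is not the DIRECT model itself.

```python
# Embed with a (pre-trained, discriminatively learned) network, then cluster.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

embed = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 16))

def cluster(features: torch.Tensor, n_clusters: int):
    """features: (num_samples, 128) float tensor of input features."""
    with torch.no_grad():
        z = embed(features).numpy()               # embedding as feature extractor
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(z)
```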

Joint Deep Modeling of Users and Items Using Reviews for Recommendation

5 code implementations • 17 Jan 2017 • Lei Zheng, Vahid Noroozi, Philip S. Yu

One of the networks focuses on learning user behaviors by exploiting reviews written by the user, and the other learns item properties from the reviews written for the item.

Recommendation Systems
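A simplified two-tower sketch of that idea (layer sizes and the final interaction layer are placeholders, not the paper's exact setup): one network encodes a user's reviews, a parallel network encodes the reviews written about an item, and a shared layer predicts the rating.

```python
# Two parallel review encoders joined for rating prediction, simplified.
import torch
import torch.nn as nn

class ReviewTower(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, out_dim=32):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, emb_dim)     # bag of review words
        self.proj = nn.Sequential(nn.Linear(emb_dim, out_dim), nn.ReLU())

    def forward(self, token_ids):
        return self.proj(self.emb(token_ids))

class JointReviewModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.user_net = ReviewTower(vocab_size)   # learns user behavior from their reviews
        self.item_net = ReviewTower(vocab_size)   # learns item properties from its reviews
        self.predict = nn.Linear(64, 1)           # stand-in for the interaction layer

    def forward(self, user_tokens, item_tokens):
        joint = torch.cat([self.user_net(user_tokens),
                           self.item_net(item_tokens)], dim=-1)
        return self.predict(joint).squeeze(-1)    # predicted rating
```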
