Search Results for author: Jiatong Zhou

Found 4 papers, 0 papers with code

Noisy Training Improves E2E ASR for the Edge

no code implementations 9 Jul 2021 Dilin Wang, Yuan Shangguan, Haichuan Yang, Pierce Chuang, Jiatong Zhou, Meng Li, Ganesh Venkatesh, Ozlem Kalinli, Vikas Chandra

We apply noisy training to improve both dense and sparse state-of-the-art Emformer models and observe consistent WER reduction.

Data Augmentation, Speech Recognition

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition

no code implementations 6 Apr 2021 Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer

As speech-enabled devices such as smartphones and smart speakers become increasingly ubiquitous, there is growing interest in building automatic speech recognition (ASR) systems that can run directly on-device; end-to-end (E2E) speech recognition models such as recurrent neural network transducers and their variants have recently emerged as prime candidates for this task.

Speech Recognition

A Multi-View Approach To Audio-Visual Speaker Verification

no code implementations 11 Feb 2021 Leda Sari, Kritika Singh, Jiatong Zhou, Lorenzo Torresani, Nayan Singhal, Yatharth Saraf

Although speaker verification has conventionally been an audio-only task, some practical applications provide both audio and visual streams of input.

Speaker Verification

Stacked Latent Attention for Multimodal Reasoning

no code implementations CVPR 2018 Haoqi Fan, Jiatong Zhou

Attention has been shown to be a pivotal development in deep learning and has been used for a multitude of multimodal learning tasks such as visual question answering and image captioning.

Image Captioning, Question Answering
