Search Results for author: Yongqiang Wang

Found 40 papers, 7 papers with code

Towards end-to-end spoken language understanding

1 code implementation • 23 Feb 2018 • Dmitriy Serdyuk, Yongqiang Wang, Christian Fuegen, Anuj Kumar, Baiyang Liu, Yoshua Bengio

A spoken language understanding system is traditionally designed as a pipeline of several components.

Natural Language Understanding Spoken Language Understanding

End-to-end contextual speech recognition using class language models and a token passing decoder

no code implementations • 5 Dec 2018 • Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen

In this work, we focus on contextual speech recognition, which is particularly challenging for E2E models because it introduces significant mismatch between training and test data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Robust Almost Global Splay State Stabilization of Pulse Coupled Oscillators

no code implementations • 2 Aug 2019 • Francesco Ferrante, Yongqiang Wang

This technical note deals with the problem of asymptotically stabilizing the splay state configuration of a network of identical pulse coupled oscillators through the design of their phase response function.

Transformer-based Acoustic Modeling for Hybrid Speech Recognition

no code implementations • 22 Oct 2019 • Yongqiang Wang, Abdel-rahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer

We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech recognition.

Ranked #23 on Speech Recognition on LibriSpeech test-other (using extra training data)

Language Modelling speech-recognition +1

Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

1 code implementation • 23 Oct 2019 • Andros Tjandra, Chunxi Liu, Frank Zhang, Xiaohui Zhang, Yongqiang Wang, Gabriel Synnaeve, Satoshi Nakamura, Geoffrey Zweig

As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses, we introduce intermediate model heads and loss functions.

Training ASR models by Generation of Contextual Information

no code implementations • 27 Oct 2019 • Kritika Singh, Dmytro Okhonko, Jun Liu, Yongqiang Wang, Frank Zhang, Ross Girshick, Sergey Edunov, Fuchun Peng, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed

Supervised ASR models have reached unprecedented levels of accuracy, thanks in part to ever-increasing amounts of labelled training data.

speech-recognition Speech Recognition +2

Transformer-Transducer: End-to-End Speech Recognition with Self-Attention

1 code implementation • 28 Oct 2019 • Ching-Feng Yeh, Jay Mahadeokar, Kaustubh Kalgaonkar, Yongqiang Wang, Duc Le, Mahaveer Jain, Kjell Schubert, Christian Fuegen, Michael L. Seltzer

We explore options to use Transformer networks in a neural transducer for end-to-end speech recognition.

speech-recognition Speech Recognition

Improving N-gram Language Models with Pre-trained Deep Transformer

no code implementations • 22 Nov 2019 • Yiren Wang, Hongzhao Huang, Zhe Liu, Yutong Pang, Yongqiang Wang, ChengXiang Zhai, Fuchun Peng

Although n-gram language models (LMs) have been outperformed by state-of-the-art neural LMs, they are still widely used in speech recognition due to their high efficiency in inference.

Data Augmentation speech-recognition +2

Global Synchronization of Pulse-Coupled Oscillator Networks Under Byzantine Attacks

no code implementations • 7 May 2020 • Zhenqian Wang, Yongqiang Wang

Given the distributed and unattended nature of wireless sensor networks, it is imperative to enhance the resilience of PCO synchronization against malicious attacks.

Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory

no code implementations • 16 May 2020 • Chunyang Wu, Yongqiang Wang, Yangyang Shi, Ching-Feng Yeh, Frank Zhang

The memory bank stores the embedding information for all the processed segments.

Weak-Attention Suppression For Transformer Based Speech Recognition

no code implementations • 18 May 2020 • Yangyang Shi, Yongqiang Wang, Chunyang Wu, Christian Fuegen, Frank Zhang, Duc Le, Ching-Feng Yeh, Michael L. Seltzer

Transformers, originally proposed for natural language processing (NLP) tasks, have recently achieved great success in automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces

no code implementations • 19 May 2020 • Frank Zhang, Yongqiang Wang, Xiaohui Zhang, Chunxi Liu, Yatharth Saraf, Geoffrey Zweig

In this work, we first show that on the widely used LibriSpeech benchmark, our transformer-based context-dependent connectionist temporal classification (CTC) system produces state-of-the-art results.

Ranked #17 on Speech Recognition on LibriSpeech test-other (using extra training data)

Speech Recognition

Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition

1 code implementation • 21 Oct 2020 • Yangyang Shi, Yongqiang Wang, Chunyang Wu, Ching-Feng Yeh, Julian Chan, Frank Zhang, Duc Le, Mike Seltzer

For a low latency scenario with an average latency of 80 ms, Emformer achieves WER $3.01\%$ on test-clean and $7.09\%$ on test-other.

speech-recognition Speech Recognition

Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications

no code implementations • 27 Oct 2020 • Yongqiang Wang, Yangyang Shi, Frank Zhang, Chunyang Wu, Julian Chan, Ching-Feng Yeh, Alex Xiao

We compare the transformer based acoustic models with their LSTM counterparts on industrial scale tasks.

speech-recognition Speech Recognition +1

Streaming Simultaneous Speech Translation with Augmented Memory Transformer

no code implementations • 30 Oct 2020 • Xutai Ma, Yongqiang Wang, Mohammad Javad Dousti, Philipp Koehn, Juan Pino

Transformer-based models have achieved state-of-the-art performance on speech translation tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Streaming Attention-Based Models with Augmented Memory for End-to-End Speech Recognition

no code implementations • 3 Nov 2020 • Ching-Feng Yeh, Yongqiang Wang, Yangyang Shi, Chunyang Wu, Frank Zhang, Julian Chan, Michael L. Seltzer

Attention-based models have been gaining popularity recently for their strong performance demonstrated in fields such as machine translation and automatic speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Large topological Hall effect near room temperature in noncollinear ferromagnet LaMn2Ge2 single crystal

no code implementations • 11 Feb 2021 • Gaoshang Gong, Longmeng Xu, Yuming Bai, Yongqiang Wang, Songliu Yuan, Yong Liu, Zhaoming Tian

Non-trivial spin structures in itinerant magnets can give rise to topological Hall effect (THE) due to the interacting local magnetic moments and conductive electrons.

Strongly Correlated Electrons

Deep Multi-agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic

3 code implementations • 12 May 2021 • Dong Chen, Mohammad Hajidavalloo, Zhaojian Li, Kaian Chen, Yongqiang Wang, Longsheng Jiang, Yue Wang

On-ramp merging is a challenging task for autonomous vehicles (AVs), especially in mixed traffic where AVs coexist with human-driven vehicles (HDVs).

Autonomous Vehicles reinforcement-learning +1

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

no code implementations • 27 Sep 2021 • Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu

We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models pre-trained using large, diverse unlabeled datasets containing approximately a million hours of audio.

Ranked #1 on Speech Recognition on Common Voice

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Unsupervised Data Selection via Discrete Speech Representation for ASR

no code implementations • 5 Apr 2022 • Zhiyun Lu, Yongqiang Wang, Yu Zhang, Wei Han, Zhehuai Chen, Parisa Haghani

Self-supervised learning of speech representations has achieved impressive results in improving automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Decentralized Stochastic Optimization with Inherent Privacy Protection

no code implementations • 8 May 2022 • Yongqiang Wang, H. Vincent Poor

Decentralized stochastic optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.

Stochastic Optimization

Quantization enabled Privacy Protection in Decentralized Stochastic Optimization

no code implementations • 7 Aug 2022 • Yongqiang Wang, Tamer Basar

In combination with the presented quantization scheme, the proposed algorithm ensures, for the first time, rigorous differential privacy in decentralized stochastic optimization without losing provable convergence accuracy.

Quantization Stochastic Optimization

Accelerating RNN-T Training and Inference Using CTC guidance

no code implementations • 29 Oct 2022 • Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani

We propose a novel method to accelerate the training and inference of the recurrent neural network transducer (RNN-T), based on guidance from a co-trained connectionist temporal classification (CTC) model.

Decentralized Nonconvex Optimization with Guaranteed Privacy and Accuracy

no code implementations • 14 Dec 2022 • Yongqiang Wang, Tamer Basar

The new algorithm allows the incorporation of persistent additive noise to enable rigorous differential privacy for data samples, gradients, and intermediate optimization variables without losing provable convergence, thus circumventing the dilemma of trading accuracy for privacy in differential privacy design.

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

no code implementations • 2 Mar 2023 • Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu

We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

AudioPaLM: A Large Language Model That Can Speak and Listen

no code implementations • 22 Jun 2023 • Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Frank

AudioPaLM inherits the capability to preserve paralinguistic information such as speaker identity and intonation from AudioLM and the linguistic knowledge present only in text large language models such as PaLM-2.

Language Modelling Large Language Model +5

Locally Differentially Private Distributed Online Learning with Guaranteed Optimality

no code implementations • 25 Jun 2023 • Ziqin Chen, Yongqiang Wang

Distributed online learning is gaining increased traction due to its unique ability to process large-scale datasets and streaming data.

Image Classification

MFMAN-YOLO: A Method for Detecting Pole-like Obstacles in Complex Environment

no code implementations • 24 Jul 2023 • Lei Cai, Hao Wang, Congling Zhou, Yongqiang Wang, Boyu Liu

To solve the problem that the feature information of pole-like obstacles in complex environments is easily lost, resulting in low detection accuracy and low real-time performance, a multi-scale hybrid attention mechanism detection algorithm is proposed in this paper.

Communication-Efficient Decentralized Multi-Agent Reinforcement Learning for Cooperative Adaptive Cruise Control

1 code implementation • 4 Aug 2023 • Dong Chen, Kaixiang Zhang, Yongqiang Wang, Xunyuan Yin, Zhaojian Li, Dimitar Filev

Connected and autonomous vehicles (CAVs) promise next-gen transportation systems with enhanced safety, energy efficiency, and sustainability.

Autonomous Vehicles Multi-agent Reinforcement Learning +1

Microvasculature Segmentation in Human BioMolecular Atlas Program (HuBMAP)

no code implementations • 6 Aug 2023 • Youssef Sultan, Yongqiang Wang, James Scanlon, Lisa D'lima

Image segmentation serves as a critical tool across a range of applications, encompassing autonomous driving's pedestrian detection and pre-operative tumor delineation in the medical sector.

Benchmarking Image Segmentation +3

Enhanced Residual SwinV2 Transformer for Learned Image Compression

no code implementations • 23 Aug 2023 • Yongqiang Wang, Feng Liang, Haisheng Fu, Jie Liang, Haipeng Qin, Junzhe Liang

In particular, our method achieves comparable results while reducing model complexity by 56% compared to these recent methods.

Image Compression

Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation

no code implementations • 5 Sep 2023 • Haisheng Fu, Feng Liang, Jie Liang, Yongqiang Wang, Guohe Zhang, Jingning Han

Then we only encode non-zero channels in the encoding and decoding process, which can greatly reduce the encoding and decoding time.

Image Compression Knowledge Distillation +2

USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models

no code implementations • 14 Sep 2023 • Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang

We show that the USM-SCD model can achieve more than 75% average speaker change detection F1 score across a test set that consists of data from 96 languages.

Change Detection

SLM: Bridge the thin gap between speech and text foundation models

no code implementations • 30 Sep 2023 • Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Yongqiang Wang, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul Rubenstein, Lukas Zilka, Dian Yu, Zhong Meng, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu

We present a joint Speech and Language Model (SLM), a multitask, multilingual, and dual-modal model that takes advantage of pretrained foundational speech and language models.

Instruction Following Language Modelling +3

Locally Differentially Private Gradient Tracking for Distributed Online Learning over Directed Graphs

no code implementations • 24 Oct 2023 • Ziqin Chen, Yongqiang Wang

To the best of our knowledge, this is the first result that simultaneously ensures learning accuracy and rigorous local differential privacy in distributed online learning over directed graphs.

Image Classification

Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study

no code implementations • 23 Jan 2024 • W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath

In the era of large models, the autoregressive nature of decoding often results in latency serving as a significant bottleneck.

Language Modelling Large Language Model +2

Privacy-Preserving Distributed Optimization and Learning

no code implementations • 29 Feb 2024 • Ziqin Chen, Yongqiang Wang

We first discuss cryptography, differential privacy, and other techniques that can be used for privacy preservation and indicate their pros and cons for privacy protection in distributed optimization and learning.

Distributed Optimization Privacy Preserving

Privacy in Multi-agent Systems

no code implementations • 5 Mar 2024 • Yongqiang Wang

With the increasing awareness of privacy and the deployment of legislation in various multi-agent system application domains such as power systems and intelligent transportation, the privacy protection problem for multi-agent systems has gained increased traction in recent years.

Quantization Avoids Saddle Points in Distributed Optimization

no code implementations • 15 Mar 2024 • Yanan Bo, Yongqiang Wang

More specifically, we propose a stochastic quantization scheme and prove that it can effectively escape saddle points and ensure convergence to a second-order stationary point in distributed nonconvex optimization.

Distributed Optimization Quantization

S2LIC: Learned Image Compression with the SwinV2 Block, Adaptive Channel-wise and Global-inter Attention Context

1 code implementation • 21 Mar 2024 • Yongqiang Wang, Feng Liang, Jie Liang, Haisheng Fu

In this paper, we propose an Adaptive Channel-wise and Global-inter attention Context (ACGC) entropy model, which can efficiently achieve dual feature aggregation in both inter-slice and intraslice contexts.

Image Compression MS-SSIM +1
