1 code implementation • 26 Nov 2024 • Yifan Yang, Jianheng Zhuo, Zengrui Jin, Ziyang Ma, Xiaoyu Yang, Zengwei Yao, Liyong Guo, Wei Kang, Fangjun Kuang, Long Lin, Daniel Povey, Xie Chen
Self-supervised learning (SSL) has achieved great success in speech-related tasks.
Automatic Speech Recognition (ASR)
1 code implementation • 7 Oct 2024 • Zengwei Yao, Wei Kang, Xiaoyu Yang, Fangjun Kuang, Liyong Guo, Han Zhu, Zengrui Jin, Zhaoqing Li, Long Lin, Daniel Povey
Connectionist Temporal Classification (CTC) is a widely used method for automatic speech recognition (ASR), renowned for its simplicity and computational efficiency.
Ranked #1 on Speech Recognition on GigaSpeech TEST
Automatic Speech Recognition (ASR)
no code implementations • 1 Sep 2024 • Zengrui Jin, Yifan Yang, Mohan Shi, Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Lingwei Meng, Long Lin, Yong Xu, Shi-Xiong Zhang, Daniel Povey
This paper presents a large-scale far-field overlapping speech dataset, crafted to advance research in speech separation, recognition, and speaker diarization.
no code implementations • 8 Jul 2024 • Detian Chu, Linyuan Bai, Jianuo Huang, Zhenlong Fang, Peng Zhang, Wei Kang, Haifeng Lin
With the advancement of autonomous driving, ensuring safety during motion planning and navigation is becoming increasingly important.
no code implementations • 21 Mar 2024 • Maoxuan Zhou, Wei Kang, Kun He
First, the Gramian angular field (GAF) encoding technique is used to encode the time-domain signal of the rolling bearing and generate a feature map that retains the complete information of the vibration signal.
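For readers unfamiliar with this encoding, the following is a minimal sketch of a Gramian angular field, not the authors' implementation. It assumes the common summation variant (the snippet does not say which variant the paper uses), and the function name `gramian_angular_field` and its `method` parameter are hypothetical. The signal is rescaled to [-1, 1], each sample is mapped to an angle with arccos, and the image entry (i, j) is the cosine of the sum of the two angles.

```python
import math


def gramian_angular_field(signal, method="summation"):
    """Encode a 1-D signal as a 2-D Gramian angular field (illustrative sketch)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must contain at least two distinct values")

    # Rescale to [-1, 1] so that arccos is defined; clamp to guard against
    # floating-point overshoot.
    scaled = [min(1.0, max(-1.0, 2.0 * (v - lo) / (hi - lo) - 1.0)) for v in signal]

    # Polar encoding: each sample becomes an angle in [0, pi].
    phi = [math.acos(v) for v in scaled]

    if method == "summation":
        # Summation field: cos(phi_i + phi_j)
        return [[math.cos(a + b) for b in phi] for a in phi]
    if method == "difference":
        # Difference field: sin(phi_i - phi_j)
        return [[math.sin(a - b) for b in phi] for a in phi]
    raise ValueError("method must be 'summation' or 'difference'")


if __name__ == "__main__":
    # Toy vibration-like signal (made-up values, for illustration only).
    toy_signal = [math.sin(0.4 * n) for n in range(8)]
    image = gramian_angular_field(toy_signal)
    print(len(image), "x", len(image[0]))
```

The result is a square matrix with one row and column per sample, which is what allows a time series to be fed to an image-style model.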
1 code implementation • 17 Oct 2023 • Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey
The Conformer has become the most popular encoder model for automatic speech recognition (ASR).
Ranked #3 on Speech Recognition on WenetSpeech
Automatic Speech Recognition (ASR)
2 code implementations • 15 Sep 2023 • Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey
In this paper, we introduce Libriheavy, a large-scale ASR corpus consisting of 50,000 hours of read English speech derived from LibriVox.
2 code implementations • 14 Sep 2023 • Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey
An additional style prompt can be given to the text encoder and guide the ASR system to output different styles of transcriptions.
no code implementations • 14 Jul 2023 • Wei Kang, Liang Xu, Hong Zhou
We propose a novel learning-based surrogate data assimilation (DA) model for efficient state estimation in a limited area.
1 code implementation • 19 May 2023 • Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey
Neural Transducer and connectionist temporal classification (CTC) are popular end-to-end automatic speech recognition systems.
1 code implementation • 19 May 2023 • Zengwei Yao, Wei Kang, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Yifan Yang, Long Lin, Daniel Povey
Our work is open-sourced and publicly available at https://github.com/k2-fsa/k2.
no code implementations • 1 Nov 2022 • Wei Kang, Daniel M. Tartakovsky, Apoorv Srivastava
We introduce a mathematical formulation of feature-informed data assimilation (FIDA).
1 code implementation • 31 Oct 2022 • Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Żelasko, Daniel Povey
In this work, we introduce a constrained version of transducer loss to learn strictly monotonic alignments between the sequences; we also improve the standard greedy search and beam search algorithms by limiting the number of symbols that can be emitted per time step in transducer decoding, making it more efficient to decode in parallel with batches.
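The per-step symbol limit mentioned above can be illustrated with a small sketch. This is not the k2 or paper implementation: the `joiner` callable, the `max_sym_per_frame` parameter name, and the toy scoring function are all hypothetical stand-ins for a real decoder and joiner network. It shows only greedy search, where each frame may emit at most a fixed number of non-blank symbols before decoding moves to the next frame.

```python
def greedy_search(joiner, num_frames, blank_id=0, max_sym_per_frame=1):
    """Greedy transducer decoding with a cap on symbols emitted per frame.

    joiner(t, hyp) is assumed to return a list of scores over the vocabulary
    for frame t given the tokens emitted so far (hypothetical interface).
    """
    hyp = []
    for t in range(num_frames):
        emitted = 0
        while emitted < max_sym_per_frame:
            scores = joiner(t, hyp)
            token = max(range(len(scores)), key=scores.__getitem__)
            if token == blank_id:
                # Blank means "advance to the next frame".
                break
            hyp.append(token)
            emitted += 1
        # Reaching the cap also forces a move to the next frame, so the
        # number of joiner calls per frame is bounded.
    return hyp


def always_emit_joiner(t, hyp):
    """Toy joiner that never prefers blank (made up for illustration)."""
    return [0.0, 1.0]  # index 0 is blank, index 1 is a non-blank token


if __name__ == "__main__":
    print(greedy_search(always_emit_joiner, num_frames=3, max_sym_per_frame=1))
    print(greedy_search(always_emit_joiner, num_frames=3, max_sym_per_frame=2))
```

Without the cap, a joiner that never prefers blank would keep the inner loop running indefinitely. With the cap, every utterance does a bounded amount of work per frame, which is the property that makes it simpler to step a whole batch through frames together.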
1 code implementation • 31 Oct 2022 • Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Żelasko, Daniel Povey
Although on-the-fly teacher label generation tackles this issue, the training speed is significantly slower as the teacher model has to be evaluated every batch.
Automatic Speech Recognition (ASR)
1 code implementation • 31 Oct 2022 • Wei Kang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Long Lin, Piotr Żelasko, Daniel Povey
In streaming automatic speech recognition (ASR), it is desirable to reduce latency as much as possible while having minimum impact on recognition accuracy.
Automatic Speech Recognition (ASR)
no code implementations • 23 Jun 2022 • Fangjun Kuang, Liyong Guo, Wei Kang, Long Lin, Mingshuang Luo, Zengwei Yao, Daniel Povey
The RNN-Transducer (RNN-T) framework for speech recognition has been growing in popularity, particularly for deployed real-time ASR systems, because it combines high accuracy with naturally streaming recognition.
1 code implementation • 1 May 2022 • Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang
The proposed architectures are compared against standard neural network feedback controllers through numerical simulations of two high-dimensional nonlinear optimal control problems: stabilization of an unstable Burgers-type partial differential equation, and altitude and course tracking for an unmanned aerial vehicle.
no code implementations • 13 Apr 2022 • Huming Qiu, Hua Ma, Zhi Zhang, Alsharif Abuadbba, Wei Kang, Anmin Fu, Yansong Gao
Since Deep Learning (DL) backdoor attacks have been revealed as one of the most insidious adversarial attacks, a number of countermeasures have been developed with certain assumptions defined in their respective threat models.
no code implementations • 9 Mar 2022 • Tianwei Xia, Kai Sun, Wei Kang
A case study is carried out for a microgrid model based on a modified Kundur two-area system to test the real-time performance of the proposed control scheme.
no code implementations • 11 Jan 2022 • Wei Kang, Liang Xu, Hong Zhou
In this paper, we introduce the concept of observability of targeted state variables for systems that may not be fully observable.
no code implementations • 29 Dec 2021 • Wei Kang, Kai Sun, Liang Xu
We prove that a neural network approximation exists for the Lyapunov function of power systems such that the approximation error is a cubic polynomial of the number of generators.
no code implementations • 15 Sep 2021 • Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang
In this paper we use numerical simulations to demonstrate that typical test accuracy metrics do not effectively capture the ability of an NN controller to stabilize a system.
1 code implementation • 15 Dec 2020 • Maisie Badami, Marcos Baez, Shayan Zamanirad, Wei Kang
Systematic literature reviews (SLRs) are at the heart of evidence-based research, setting the foundation for future research and practice.
no code implementations • 3 Dec 2020 • Wei Kang, Qi Gong
This capability is critical because it enables the analysis of approximation errors for problems for which analytic solutions are not available, such as differential equations and optimal control.
no code implementations • 11 Sep 2020 • Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang
In this paper we propose a new computational method for designing optimal regulators for high-dimensional nonlinear systems.
BIG-bench Machine Learning
Physics-informed machine learning
no code implementations • 21 Nov 2019 • Tenavi Nakamura-Zimmerer, Daniele Venturi, Qi Gong, Wei Kang
Uncertainty propagation in nonlinear dynamic systems remains an outstanding problem in scientific computing and control.
1 code implementation • 11 Jul 2019 • Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang
In this paper, we propose a data-driven method to approximate semi-global solutions to HJB equations for general high-dimensional nonlinear systems and compute candidate optimal feedback controls in real-time.