Search Results for author: Keyu An

Found 13 papers, 6 papers with code

Sequential Deformation for Accurate Scene Text Detection

no code implementations • ECCV 2020 • Shanyu Xiao, Liangrui Peng, Ruijie Yan, Keyu An, Gang Yao, Jaesik Min

Scene text detection has been significantly advanced over recent years, especially after the emergence of deep neural networks.

Scene Text Detection Text Detection

Exploring RWKV for Memory Efficient and Low Latency Streaming ASR

no code implementations • 26 Sep 2023 • Keyu An, Shiliang Zhang

Recently, self-attention-based transformers and conformers have been introduced as alternatives to RNNs for ASR acoustic modeling.

Chunking

BAT: Boundary aware transducer for memory-efficient and low-latency ASR

1 code implementation • 19 May 2023 • Keyu An, Xian Shi, Shiliang Zhang

Recently, the recurrent neural network transducer (RNN-T) has gained increasing popularity due to its natural streaming capability as well as its superior performance.

Ranked #9 on Speech Recognition on AISHELL-1

Automatic Speech Recognition Automatic Speech Recognition (ASR)

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

1 code implementation • 31 Mar 2022 • Keyu An, Huahuan Zheng, Zhijian Ou, Hongyu Xiang, Ke Ding, Guanglu Wan

The simulation module is jointly trained with the ASR model using a self-supervised loss; the ASR model is optimized with the usual ASR loss, e.g., CTC-CRF as used in our experiments.

Chunking speech-recognition +1

Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study

no code implementations • 31 Mar 2022 • Keyu An, Ji Xiao, Zhijian Ou

In this paper, we systematically compare the performance of three schemes to exploit external single-channel data for multi-channel end-to-end ASR, namely back-end pre-training, data scheduling, and data simulation, under different settings such as the sizes of the single-channel data and the choices of the front-end.

Scheduling speech-recognition +1

An Empirical Study of Language Model Integration for Transducer based Speech Recognition

no code implementations • 31 Mar 2022 • Huahuan Zheng, Keyu An, Zhijian Ou, Chen Huang, Ke Ding, Guanglu Wan

Based on the DR method, we propose a low-order density ratio method (LODR) by replacing the estimation with a low-order weak language model.

Language Modelling speech-recognition +1

Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings

1 code implementation • 11 Jul 2021 • Chengrui Zhu, Keyu An, Huahuan Zheng, Zhijian Ou

The use of phonological features (PFs) potentially allows language-specific phones to remain linked in training, which is highly desirable for information sharing for multilingual and crosslingual speech recognition methods for low-resourced languages.

speech-recognition Speech Recognition

Exploiting Single-Channel Speech For Multi-channel End-to-end Speech Recognition

no code implementations • 6 Jul 2021 • Keyu An, Zhijian Ou

Recently, the end-to-end training approach for neural beamformer-supported multi-channel ASR has shown its effectiveness in multi-channel speech recognition.

Data Augmentation Scheduling +2

Deformable TDNN with adaptive receptive fields for speech recognition

no code implementations • 30 Apr 2021 • Keyu An, Yi Zhang, Zhijian Ou

Time Delay Neural Networks (TDNNs) are widely used in both DNN-HMM based hybrid speech recognition systems and recent end-to-end systems.

speech-recognition Speech Recognition

The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines

no code implementations • 13 Nov 2020 • Fan Yu, Zhuoyuan Yao, Xiong Wang, Keyu An, Lei Xie, Zhijian Ou, Bo Liu, Xiulin Li, Guanqiong Miao

Automatic speech recognition (ASR) has been significantly advanced with the use of deep learning and big data.

Sound Audio and Speech Processing

Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients

1 code implementation • 11 Nov 2020 • Huahuan Zheng, Keyu An, Zhijian Ou

Using ST gradients to support sub-graph sampling is a core element to achieve efficient NAS beyond DARTS and SNAS.

Ranked #1 on Speech Recognition on WSJ dev93

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency

1 code implementation • 27 May 2020 • Keyu An, Hongyu Xiang, Zhijian Ou

In this paper, we present a new open source toolkit for speech recognition, named CAT (CTC-CRF based ASR Toolkit).

Ranked #1 on Speech Recognition on Hub5'00 FISHER-SWBD

speech-recognition Speech Recognition

CAT: CRF-based ASR Toolkit

2 code implementations • 20 Nov 2019 • Keyu An, Hongyu Xiang, Zhijian Ou

In this paper, we present a new open source toolkit for automatic speech recognition (ASR), named CAT (CRF-based ASR Toolkit).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
