Search Results for author: Kai Zhen

Found 8 papers, 0 papers with code

Sub-8-bit quantization for on-device speech recognition: a regularization-free approach

no code implementations • 17 Oct 2022 • Kai Zhen, Martin Radfar, Hieu Duy Nguyen, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

For on-device automatic speech recognition (ASR), quantization aware training (QAT) is ubiquitous to achieve the trade-off between model predictive performance and efficiency.

Automatic Speech Recognition (ASR) +3

Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition

no code implementations • 30 Jun 2022 • Kai Zhen, Hieu Duy Nguyen, Raviteja Chinta, Nathan Susanj, Athanasios Mouchtaris, Tariq Afzal, Ariya Rastrow

We present a novel sub-8-bit quantization-aware training (S8BQAT) scheme for 8-bit neural network accelerators.

Quantization speech-recognition +1

Scalable and Efficient Neural Speech Coding: A Hybrid Design

no code implementations • 27 Mar 2021 • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beak, Minje Kim

We formulate the speech coding problem as an autoencoding task, where a convolutional neural network (CNN) performs encoding and decoding as a neural waveform codec (NWC) during its feedforward routine.

Quantization

Sparsification via Compressed Sensing for Automatic Speech Recognition

no code implementations • 9 Feb 2021 • Kai Zhen, Hieu Duy Nguyen, Feng-Ju Chang, Athanasios Mouchtaris, Ariya Rastrow

In the literature, such methods are referred to as sparse pruning.

Automatic Speech Recognition (ASR) +2

Psychoacoustic Calibration of Loss Functions for Efficient End-to-End Neural Audio Coding

no code implementations • 31 Dec 2020 • Kai Zhen, Mi Suk Lee, Jongmo Sung, SeungKwon Beack, Minje Kim

Conventional audio coding technologies commonly leverage human perception of sound, or psychoacoustics, to reduce the bitrate while preserving the perceptual quality of the decoded audio signals.

A Dual-Staged Context Aggregation Method Towards Efficient End-To-End Speech Enhancement

no code implementations • 18 Aug 2019 • Kai Zhen, Mi Suk Lee, Minje Kim

In speech enhancement, an end-to-end deep neural network converts a noisy speech signal to a clean speech directly in time domain without time-frequency transformation or mask estimation.

Speech Enhancement

Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding

no code implementations • 18 Jun 2019 • Kai Zhen, Jongmo Sung, Mi Suk Lee, Seung-Kwon Beack, Minje Kim

Speech codecs learn compact representations of speech signals to facilitate data transmission.

A Hybrid Supervised-unsupervised Method on Image Topic Visualization with Convolutional Neural Network and LDA

no code implementations • 15 Mar 2017 • Kai Zhen, Mridul Birla, David Crandall, Bingjing Zhang, Judy Qiu

Given the progress in image recognition with recent data driven paradigms, it's still expensive to manually label a large training data to fit a convolutional neural network (CNN) model.
