Search Results for author: Fuchun Peng

Found 16 papers, 0 papers with code

Mitigating Unintended Memorization in Language Models via Alternating Teaching

no code implementations · 13 Oct 2022 · Zhe Liu, Xuedong Zhang, Fuchun Peng

Recent research has shown that language models tend to memorize rare or unique sequences in their training corpora, which can leak sensitive attributes of user data.

Memorization · Privacy Preserving

Group Personalized Federated Learning

no code implementations · 4 Oct 2022 · Zhe Liu, Yue Hui, Fuchun Peng

Federated learning (FL) can help promote data privacy by training a shared model in a decentralized manner on clients' physical devices.

Personalized Federated Learning
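The shared-model training this abstract refers to is typically built on an aggregation rule such as FedAvg. A minimal stdlib sketch of that generic rule (not this paper's group-personalized variant, whose details are not shown here); the flat parameter lists and client sizes are toy assumptions:

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: average the clients' parameter vectors,
    weighted by each client's local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

# two clients with toy 2-parameter models; client 2 has 3x the data
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
# → [2.5, 3.5]
```

In practice each round sends the aggregated model back to the clients for further local training; the weighting by dataset size is what makes the average match training on the pooled data.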

Modeling Dependent Structure for Utterances in ASR Evaluation

no code implementations · 7 Sep 2022 · Zhe Liu, Fuchun Peng

In this paper, we present graphical lasso based methods to explicitly model such dependency and estimate uncorrelated blocks of utterances in a rigorous way, after which blockwise bootstrap is applied on top of the inferred blocks.

Automatic Speech Recognition (ASR) +1
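Once graphical lasso has estimated a sparse precision matrix over utterance-level statistics, the "uncorrelated blocks of utterances" can be read off as the connected components of its nonzero pattern. A hedged stdlib sketch of that second step only; the toy precision matrix is an assumption standing in for the graphical-lasso output:

```python
def dependency_blocks(precision, tol=1e-8):
    """Group utterances into blocks via connected components of the
    graph whose edges are the nonzero off-diagonal precision entries."""
    n = len(precision)
    adj = [[j for j in range(n) if j != i and abs(precision[i][j]) > tol]
           for i in range(n)]
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:  # iterative depth-first search
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            comp.append(v)
            stack.extend(adj[v])
        result.append(sorted(comp))
    return result

# toy sparse precision matrix: utterances {0,1} and {2,3} are
# dependent pairs, utterance 4 is independent of everything
P = [
    [1.0, 0.4, 0.0, 0.0, 0.0],
    [0.4, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.3, 0.0],
    [0.0, 0.0, 0.3, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
blocks = dependency_blocks(P)
# → [[0, 1], [2, 3], [4]]
```

The resulting blocks are exactly what the blockwise bootstrap below consumes as its resampling units.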

Neural-FST Class Language Model for End-to-End Speech Recognition

no code implementations · 28 Jan 2022 · Antoine Bruguier, Duc Le, Rohit Prabhavalkar, Dangna Li, Zhe Liu, Bo Wang, Eun Chang, Fuchun Peng, Ozlem Kalinli, Michael L. Seltzer

We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech recognition, a novel method that combines neural network language models (NNLMs) and finite state transducers (FSTs) in a mathematically consistent framework.

Language Modelling · speech-recognition +1

Private Language Model Adaptation for Speech Recognition

no code implementations · 28 Sep 2021 · Zhe Liu, Ke Li, Shreyan Bakshi, Fuchun Peng

Speech model adaptation is crucial to handle the discrepancy between server-side proxy training data and actual data received on local devices of users.

Automatic Speech Recognition (ASR) +3

Model-Based Approach for Measuring the Fairness in ASR

no code implementations · 19 Sep 2021 · Zhe Liu, Irina-Elena Veliche, Fuchun Peng

The issue of fairness arises when automatic speech recognition (ASR) systems do not perform equally well for all subgroups of the population.

Automatic Speech Recognition (ASR) +2

Federated Marginal Personalization for ASR Rescoring

no code implementations · 1 Dec 2020 · Zhe Liu, Fuchun Peng

Our approach can overcome the limitations of federated fine-tuning and efficiently learn personalized NNLMs on devices.

Federated Learning · speech-recognition +1
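One generic form of marginal personalization (an illustrative sketch only, not necessarily the exact scheme of this paper) rescales a base LM distribution toward a user's unigram statistics and renormalizes; the word lists, frequencies, and the `beta` exponent are all toy assumptions:

```python
def marginal_personalize(base_probs, personal_unigram, background_unigram,
                         beta=0.5, floor=1e-6):
    """Adapt a base LM distribution toward a user's marginal word
    frequencies: p'(w) ∝ p(w) * (personal(w) / background(w)) ** beta."""
    scaled = {w: p * (personal_unigram.get(w, floor) /
                      background_unigram.get(w, floor)) ** beta
              for w, p in base_probs.items()}
    z = sum(scaled.values())  # renormalize back to a distribution
    return {w: p / z for w, p in scaled.items()}

# the base LM is indifferent between "a" and "b"; this user says "a" often
base = {"a": 0.5, "b": 0.5}
adapted = marginal_personalize(base,
                               personal_unigram={"a": 0.8, "b": 0.2},
                               background_unigram={"a": 0.5, "b": 0.5})
```

The appeal of this style of adaptation in a federated setting is that only unigram counts, not gradients or full models, need to leave or persist on the device.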

Benchmarking LF-MMI, CTC and RNN-T Criteria for Streaming ASR

no code implementations · 9 Nov 2020 · Xiaohui Zhang, Frank Zhang, Chunxi Liu, Kjell Schubert, Julian Chan, Pradyot Prakash, Jun Liu, Ching-Feng Yeh, Fuchun Peng, Yatharth Saraf, Geoffrey Zweig

In this work, to measure the accuracy and efficiency for a latency-controlled streaming automatic speech recognition (ASR) application, we perform comprehensive evaluations on three popular training criteria: LF-MMI, CTC and RNN-T.

Automatic Speech Recognition (ASR) +2

Statistical Testing on ASR Performance via Blockwise Bootstrap

no code implementations · 19 Dec 2019 · Zhe Liu, Fuchun Peng

A common question in automatic speech recognition (ASR) evaluations is how reliable an observed word error rate (WER) improvement between two ASR systems is; statistical hypothesis testing and confidence intervals (CIs) can be used to tell whether the improvement is real or due only to random chance.

Automatic Speech Recognition (ASR) +2
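A blockwise bootstrap resamples whole blocks of utterances rather than individual ones, so within-block dependency is preserved when building a CI for WER. A minimal stdlib sketch under that reading of the title; the per-utterance error counts, reference word counts, and block structure are toy assumptions:

```python
import random

def wer(errors, words):
    # word error rate = total edit errors / total reference words
    return sum(errors) / sum(words)

def blockwise_bootstrap_ci(errors, words, blocks,
                           n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI for WER: resample blocks of utterance indices
    with replacement, recompute WER on each pooled resample."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sample = [blocks[rng.randrange(len(blocks))] for _ in blocks]
        idx = [i for block in sample for i in block]
        stats.append(wer([errors[i] for i in idx],
                         [words[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# toy per-utterance edit errors and reference word counts
errors = [1, 0, 2, 1, 0, 3, 1, 0]
words  = [10, 8, 12, 9, 11, 10, 7, 13]
# utterances grouped into dependency blocks (e.g. same speaker/session)
blocks = [[0, 1], [2, 3], [4, 5], [6, 7]]
lo, hi = blockwise_bootstrap_ci(errors, words, blocks)
```

Comparing two systems works the same way, except the resampled statistic is the WER difference on paired blocks and the test checks whether the CI excludes zero.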

Improving N-gram Language Models with Pre-trained Deep Transformer

no code implementations · 22 Nov 2019 · Yiren Wang, Hongzhao Huang, Zhe Liu, Yutong Pang, Yongqiang Wang, ChengXiang Zhai, Fuchun Peng

Although n-gram language models (LMs) have been outperformed by state-of-the-art neural LMs, they are still widely used in speech recognition due to their high efficiency in inference.

Data Augmentation · speech-recognition +2
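The "Data Augmentation" tag suggests the common recipe of pooling real text with text sampled from the pre-trained neural LM before estimating n-gram counts. A toy sketch of that count-merging step (the synthetic sentences below are hand-written stand-ins, not actual transformer output):

```python
from collections import Counter

def bigram_counts(sentences):
    """Count bigrams with sentence-boundary markers, the raw
    statistic behind an n-gram LM's maximum-likelihood estimate."""
    counts = Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        counts.update(zip(toks, toks[1:]))
    return counts

real = ["the cat sat", "the dog ran"]
synthetic = ["the cat ran"]  # stand-in for text sampled from the neural LM
# Counter addition merges real and synthetic statistics
counts = bigram_counts(real) + bigram_counts(synthetic)
```

A real pipeline would weight the synthetic counts (or interpolate the two LMs) so the sampled text augments, rather than overwhelms, the in-domain statistics.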

An Empirical Study of Efficient ASR Rescoring with Transformers

no code implementations · 24 Oct 2019 · Hongzhao Huang, Fuchun Peng

In particular, our experiments on a video speech recognition dataset show that we are able to achieve WERRs ranging from 6.46% to 7.17% with only 5.5% to 11.9% of the parameters of the well-known large GPT model [1], whose WERR with rescoring on the same dataset is 7.58%.

Knowledge Distillation · Language Modelling +2

Efficient Dynamic WFST Decoding for Personalized Language Models

no code implementations · 23 Oct 2019 · Jun Liu, Jiedan Zhu, Vishal Kathuria, Fuchun Peng

A second layer is a private cache holding the graph that represents the personalized language model, shared only by utterances from a particular user.

Language Modelling · speech-recognition +1

Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models

no code implementations · 16 Oct 2019 · Tianxing He, Jun Liu, Kyunghyun Cho, Myle Ott, Bing Liu, James Glass, Fuchun Peng

We find that mix-review effectively regularizes the finetuning process, and the forgetting problem is alleviated to some extent.

Response Generation · Text Generation +1
