Search Results for author: Guanglu Wan

Found 18 papers, 6 papers with code

Learning Speaker Embedding with Momentum Contrast

1 code implementation • 7 Jan 2020 • Ke Ding, Xuanji He, Guanglu Wan

Momentum Contrast (MoCo) is a recently proposed unsupervised representation learning framework that has proven effective at learning feature representations for downstream vision tasks.

Representation Learning, Speaker Verification

Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios

no code implementations • 23 Dec 2021 • Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie, Guoqiao Yu, Guanglu Wan

Moreover, the explicit prosody features used in the prosody prediction module can increase the diversity of synthetic speech when their values are adjusted.

Speech Synthesis, Style Transfer, +1

Confidence Calibration for Intent Detection via Hyperspherical Space and Rebalanced Accuracy-Uncertainty Loss

no code implementations • 17 Mar 2022 • Yantao Gong, Cao Liu, Fan Yang, Xunliang Cai, Guanglu Wan, Jiansong Chen, Weipeng Zhang, Houfeng Wang

Experiments on the open datasets verify that our model outperforms the existing calibration methods and achieves a significant improvement on the calibration metric.

Intent Detection

CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR

1 code implementation • 31 Mar 2022 • Keyu An, Huahuan Zheng, Zhijian Ou, Hongyu Xiang, Ke Ding, Guanglu Wan

The simulation module is jointly trained with the ASR model using a self-supervised loss; the ASR model is optimized with the usual ASR loss, e.g., CTC-CRF as used in our experiments.

Chunking, speech-recognition, +1

An Empirical Study of Language Model Integration for Transducer based Speech Recognition

no code implementations • 31 Mar 2022 • Huahuan Zheng, Keyu An, Zhijian Ou, Chen Huang, Ke Ding, Guanglu Wan

Based on the density ratio (DR) method, we propose a low-order density ratio method (LODR) by replacing the estimation with a low-order weak language model.

Language Modelling, speech-recognition, +1
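
The density-ratio idea in the snippet above can be illustrated with a small rescoring sketch: the ASR log-probability is combined with an external language model score, and a low-order language model score is subtracted. The function names, the weights `lam_ext` and `lam_low`, and the toy numbers are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Tuple


def lodr_score(log_p_asr: float, log_p_ext_lm: float, log_p_low_lm: float,
               lam_ext: float = 0.5, lam_low: float = 0.25) -> float:
    """Density-ratio style score (hypothetical weights).

    log_p_asr:     log-probability of the hypothesis under the ASR model
    log_p_ext_lm:  log-probability under the external (target-domain) LM
    log_p_low_lm:  log-probability under a low-order weak LM (e.g. an n-gram)
    """
    return log_p_asr + lam_ext * log_p_ext_lm - lam_low * log_p_low_lm


def rerank(hyps: List[Tuple[str, float, float, float]]) -> str:
    """Return the text of the highest-scoring hypothesis.

    Each hypothesis is (text, log_p_asr, log_p_ext_lm, log_p_low_lm).
    """
    return max(hyps, key=lambda h: lodr_score(h[1], h[2], h[3]))[0]
```

In practice the two weights would be tuned on held-out data; the defaults here are placeholders.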

A Low-Cost, Controllable and Interpretable Task-Oriented Chatbot: With Real-World After-Sale Services as Example

no code implementations • 13 May 2022 • Xiangyu Xi, Chenxu Lv, Yuncheng Hua, Wei Ye, Chaobo Sun, Shuaipeng Liu, Fan Yang, Guanglu Wan

Though widely used in industry, traditional task-oriented dialogue systems suffer from three bottlenecks: (i) difficult ontology construction (e.g., intents and slots); (ii) poor controllability and interpretability; (iii) a heavy dependence on annotated data.

Chatbot, Task-Oriented Dialogue Systems

Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization

no code implementations • 7 Nov 2022 • Zhengkun Tian, Hongyu Xiang, Min Li, Feifei Lin, Ke Ding, Guanglu Wan

To reduce the peak latency, we propose a simple and novel method named peak-first regularization, which utilizes a frame-wise knowledge distillation function to force the probability distribution of the CTC model to shift left along the time axis instead of directly modifying the calculation process of CTC loss and gradients.

Knowledge Distillation
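
One plausible reading of the frame-wise distillation described above is a cross-entropy term in which each frame's output distribution is pulled toward the distribution of the following frame, so that probability mass tends to move earlier in time. The sketch below follows that reading; the function name `peak_first_regularizer`, the averaging over frames, and the `eps` smoothing are assumptions, and the exact loss in the paper may differ.

```python
import math
from typing import List


def peak_first_regularizer(probs: List[List[float]], eps: float = 1e-12) -> float:
    """Frame-wise distillation term (illustrative sketch).

    probs[t][k] is the model's probability of token k at frame t.
    Frame t+1 acts as the "teacher" for frame t; in a real training setup
    the teacher distribution would be treated as a constant (no gradient).
    """
    if len(probs) < 2:
        return 0.0  # nothing to distill with fewer than two frames
    total = 0.0
    for t in range(len(probs) - 1):
        teacher = probs[t + 1]
        student = probs[t]
        # Cross-entropy between the next frame (teacher) and this frame (student).
        total -= sum(q * math.log(p + eps) for p, q in zip(student, teacher))
    return total / (len(probs) - 1)
```

Such a term would typically be added to the ordinary CTC loss with a small weight, leaving the CTC computation itself unchanged.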

MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts

1 code implementation • 25 Nov 2022 • Xiangyu Xi, Jianwei Lv, Shuaipeng Liu, Wei Ye, Fan Yang, Guanglu Wan

As a pioneering exploration that expands event detection to scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations from a leading e-commerce platform for food service.

Event Detection

Exploiting Pseudo Future Contexts for Emotion Recognition in Conversations

1 code implementation • 27 Jun 2023 • Yinyi Wei, Shuaipeng Liu, Hailei Yan, Wei Ye, Tong Mo, Guanglu Wan

Specifically, for an utterance, we generate its future context with pre-trained language models, potentially containing extra beneficial knowledge in a conversational form homogeneous with the historical ones.

Emotion Recognition

CPPF: A contextual and post-processing-free model for automatic speech recognition

no code implementations • 14 Sep 2023 • Lei Zhang, Zhengkun Tian, Xiang Chen, Jiaming Sun, Hongyu Xiang, Ke Ding, Guanglu Wan

To address this issue, we draw inspiration from the multifaceted capabilities of LLMs and Whisper, and focus on integrating multiple ASR text processing tasks related to speech recognition into the ASR model.

Automatic Speech Recognition, speech-recognition, +1

Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter

no code implementations • 18 Sep 2023 • Song Li, Yongbin You, Xuezhi Wang, Ke Ding, Guanglu Wan

To further expand the applications of multilingual artificial intelligence assistants and facilitate international communication, it is essential to enhance the performance of multilingual speech recognition, which is a crucial component of speech interaction.

speech-recognition, Speech Recognition

A Task-oriented Dialog Model with Task-progressive and Policy-aware Pre-training

1 code implementation • 1 Oct 2023 • Lucen Zhong, Hengtong Lu, Caixia Yuan, Xiaojie Wang, Jiashen Sun, Ke Zeng, Guanglu Wan

A global policy consistency task is designed to capture the sequential relations of the multi-turn dialog policy, and an act-based contrastive learning task is designed to capture similarities among samples with the same dialog policy.

Contrastive Learning

One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models

no code implementations • 14 Oct 2023 • Hang Shao, Bei Liu, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian

Various Large Language Models (LLMs) from the Generative Pretrained Transformer (GPT) family have achieved outstanding performance in a wide range of text generation tasks.

Quantization, Text Generation
