Search Results for author: Chong Deng

Found 20 papers, 11 papers with code

CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models

1 code implementation • 13 Dec 2024 • Zhihao Du, Yuxuan Wang, Qian Chen, Xian Shi, Xiang Lv, Tianyu Zhao, Zhifu Gao, Yexin Yang, Changfeng Gao, Hui Wang, Fan Yu, Huadai Liu, Zhengyan Sheng, Yue Gu, Chong Deng, Wen Wang, Shiliang Zhang, Zhijie Yan, Jingren Zhou

By training on a large-scale multilingual dataset, CosyVoice 2 achieves human-parity naturalness, minimal response latency, and virtually lossless synthesis quality in the streaming mode.

In-Context Learning Quantization +1

OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation

1 code implementation • 23 Oct 2024 • Qinglin Zhang, Luyao Cheng, Chong Deng, Qian Chen, Wen Wang, Siqi Zheng, Jiaqing Liu, Hai Yu, Chaohong Tan, Zhihao Du, Shiliang Zhang

However, achieving low latency and natural interactions in full-duplex dialogue systems remains a significant challenge, especially considering human conversation dynamics such as interruptions, backchannels, and overlapping speech.

Large Language Model Spoken Dialogue Systems

Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts

no code implementations • 19 Aug 2024 • Jiaqing Liu, Chong Deng, Qinglin Zhang, Shilin Zhou, Qian Chen, Hai Yu, Wen Wang

To improve readability, we propose a Contextualized Spoken-to-Written conversion (CoS2W) task to address ASR and grammar errors and also transfer the informal text into the formal style with content preserved, utilizing contexts and auxiliary information.

Automatic Speech Recognition (ASR) +2

Multimodal Fusion and Coherence Modeling for Video Topic Segmentation

no code implementations • 1 Aug 2024 • Hai Yu, Chong Deng, Qinglin Zhang, Jiaqing Liu, Qian Chen, Wen Wang

In this work, we improve supervised VTS by thoroughly exploring multimodal fusion and multimodal coherence modeling.

Contrastive Learning Scene Segmentation +2

Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers

no code implementations • 17 Jun 2024 • Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Hai Yu, Jiaqing Liu, Yukun Ma, Chong Zhang

The Transformer architecture has significantly advanced deep learning, particularly in natural language processing, by effectively managing long-range dependencies.

Diversity Language Modeling +1

Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token-based ASR

1 code implementation • 8 Nov 2023 • Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Yukun Ma, Hai Yu, Jiaqing Liu, Chong Zhang

We find that applying the conventional cross-entropy loss on input speech tokens does not consistently improve the ASR performance over the Loss Masking approach.
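The Loss Masking approach that this abstract contrasts with cross-entropy on all tokens can be sketched as computing the loss only over positions whose mask is set, skipping the discrete speech tokens that merely condition the decoder. A minimal pure-Python sketch (the toy logits and the choice of which positions count as "speech" are illustrative assumptions, not the paper's setup):

```python
import math

def masked_cross_entropy(logits, targets, mask):
    """Mean cross-entropy over positions where mask == 1.

    logits:  per-position lists of unnormalized scores
    targets: per-position target token ids
    mask:    1 = include position (text token),
             0 = exclude it (speech/prompt token, i.e. "Loss Masking")
    """
    total, count = 0.0, 0
    for scores, t, m in zip(logits, targets, mask):
        if m == 0:
            continue
        z = max(scores)  # log-sum-exp with max-shift for stability
        log_norm = z + math.log(sum(math.exp(s - z) for s in scores))
        total += log_norm - scores[t]
        count += 1
    return total / max(count, 1)

# Toy 4-step sequence: first two positions "speech", last two "text".
logits  = [[2.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 3.0]]
targets = [0, 1, 0, 1]

loss_masked   = masked_cross_entropy(logits, targets, [0, 0, 1, 1])
loss_unmasked = masked_cross_entropy(logits, targets, [1, 1, 1, 1])
```

In framework code the same effect is usually obtained by setting the ignored positions' targets to an ignore index rather than carrying an explicit mask.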

Decoder

Improving Long Document Topic Segmentation Models With Enhanced Coherence Modeling

1 code implementation • 18 Oct 2023 • Hai Yu, Chong Deng, Qinglin Zhang, Jiaqing Liu, Qian Chen, Wen Wang

Our approach improves $F_1$ over the previous SOTA by 3.42 points (73.74 -> 77.16) and reduces $P_k$ by 1.11 points (15.0 -> 13.89) on WIKI-727K, and achieves an average relative reduction of 4.3% in $P_k$ on WikiSection.
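The $P_k$ reported here is the standard sliding-window segmentation error: slide a window of width k and count positions where the reference and hypothesis disagree on whether the window's two endpoints fall in the same segment. A minimal sketch (the window-size heuristic of half the mean reference segment length is the conventional choice, assumed here):

```python
def p_k(reference, hypothesis, k=None):
    """P_k segmentation error for two segmentations given as
    per-sentence segment labels; lower is better, 0 is perfect."""
    n = len(reference)
    if k is None:
        # Conventional heuristic: half the mean reference segment length.
        k = max(1, round(n / (len(set(reference)) * 2)))
    errors = 0
    for i in range(n - k):
        same_ref = reference[i] == reference[i + k]
        same_hyp = hypothesis[i] == hypothesis[i + k]
        errors += same_ref != same_hyp
    return errors / (n - k)

ref = [0, 0, 0, 1, 1, 1]          # one boundary after sentence 3
print(p_k(ref, ref))              # perfect hypothesis -> 0.0
print(p_k(ref, [0, 0, 0, 0, 0, 0]))  # missed boundary is penalized
```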

Information Retrieval Segmentation +3

Improving BERT with Hybrid Pooling Network and Drop Mask

no code implementations • 14 Jul 2023 • Qian Chen, Wen Wang, Qinglin Zhang, Chong Deng, Yukun Ma, Siqi Zheng

Transformer-based pre-trained language models, such as BERT, achieve great success in various natural language understanding tasks.

Language Modeling +3

Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings

1 code implementation • 18 May 2023 • Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, Hai Yu, Jiaqing Liu, Yukun Ma, Chong Zhang

Prior studies diagnose the anisotropy problem in sentence representations from pre-trained language models, e.g., BERT, without fine-tuning.

Language Modeling +5

Meeting Action Item Detection with Regularized Context Modeling

no code implementations • 27 Mar 2023 • Jiaqing Liu, Chong Deng, Qinglin Zhang, Qian Chen, Wen Wang

We construct and release the first Chinese meeting corpus with manual action item annotations.

Contrastive Learning

Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG)

no code implementations • 24 Mar 2023 • Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao

The ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG) focuses on promoting a wide range of spoken language processing (SLP) research on meeting transcripts, as SLP applications are critical for improving users' efficiency in grasping important information in meetings.

Extractive Summarization Keyphrase Extraction

MUG: A General Meeting Understanding and Generation Benchmark

1 code implementation • 24 Mar 2023 • Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao

To promote SLP advancement, we establish a large-scale general Meeting Understanding and Generation Benchmark (MUG) to benchmark the performance of a wide range of SLP tasks, including topic segmentation, topic-level and session-level extractive summarization, topic title generation, keyphrase extraction, and action item detection.

Extractive Summarization Keyphrase Extraction +1

Weighted Sampling for Masked Language Modeling

no code implementations • 28 Feb 2023 • Linhan Zhang, Qian Chen, Wen Wang, Chong Deng, Xin Cao, Kongzhang Hao, Yuxin Jiang, Wei Wang

Experiments on the Semantic Textual Similarity benchmark (STS) show that WSBERT significantly improves sentence embeddings over BERT.
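The title indicates that masking positions are sampled non-uniformly rather than at the standard uniform 15%. As one illustrative scheme (an assumption for this sketch, not necessarily the paper's exact weighting), rarer tokens can be masked more often via inverse-frequency weights:

```python
import random

def weighted_mask_positions(tokens, freq, mask_rate=0.15, seed=0):
    """Pick positions to mask, biased toward rarer tokens.

    freq maps token -> corpus frequency; weight = 1/freq is an assumed
    inverse-frequency scheme for illustration.
    """
    rng = random.Random(seed)
    weights = [1.0 / freq.get(t, 1) for t in tokens]
    n_mask = max(1, int(len(tokens) * mask_rate))
    # Weighted sampling without replacement via the exponential-sort
    # trick: draw E ~ Exp(1) per item, keep the smallest E / weight.
    keyed = sorted(range(len(tokens)),
                   key=lambda i: rng.expovariate(1.0) / weights[i])
    return sorted(keyed[:n_mask])

# A rare token is far more likely to be selected than a frequent one.
tokens = ["the"] * 9 + ["zymurgy"]
freq = {"the": 1000, "zymurgy": 1}
positions = weighted_mask_positions(tokens, freq, seed=0)
```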

Language Modeling +6

MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction

1 code implementation • Findings (ACL) 2022 • Linhan Zhang, Qian Chen, Wen Wang, Chong Deng, Shiliang Zhang, Bing Li, Wei Wang, Xin Cao

In this work, we propose a novel unsupervised embedding-based KPE approach, Masked Document Embedding Rank (MDERank), to address this problem by leveraging a mask strategy and ranking candidates by the similarity between embeddings of the source document and the masked document.
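The mask-and-rank idea described above can be sketched end to end: mask each candidate out of the document, embed both versions, and treat a larger drop in similarity as a more important phrase. Here a toy bag-of-words encoder stands in for the BERT document encoder MDERank actually uses (an illustrative substitution):

```python
import math
import collections

def bow_embed(text):
    """Toy bag-of-words embedding standing in for a BERT encoder."""
    return collections.Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mderank(document, candidates):
    """Rank candidate keyphrases: mask each candidate out of the
    document and score it by how far the masked document's embedding
    falls from the original (lower similarity -> more important)."""
    doc_emb = bow_embed(document)
    scored = []
    for cand in candidates:
        masked = document.replace(cand, "[MASK]")
        scored.append((cosine(doc_emb, bow_embed(masked)), cand))
    return [c for _, c in sorted(scored)]  # most important first

doc = "topic segmentation improves topic models for topic discovery"
ranked = mderank(doc, ["topic", "discovery"])
```

Masking the frequent phrase "topic" perturbs the document embedding more than masking "discovery", so it ranks first; the real method replaces this toy encoder with BERT embeddings.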

Contrastive Learning Document Embedding +4

LCQMC: A Large-scale Chinese Question Matching Corpus

no code implementations • COLING 2018 • Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li, Buzhou Tang

In this paper, we first use a search engine to collect large-scale question pairs related to high-frequency words from various domains, then filter irrelevant pairs by the Wasserstein distance, and finally recruit three annotators to manually check the remaining pairs.

Information Retrieval Machine Translation +3
