Search Results for author: Cheng Gong

Found 22 papers, 8 papers with code

Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models

no code implementations • 24 Jan 2025 • Tianrui Wang, Meng Ge, Cheng Gong, Chunyu Qiang, Haoyu Wang, Zikang Huang, Yu Jiang, Xiaobao Wang, Xie Chen, Longbiao Wang, Jianwu Dang

To address these challenges, we propose a characteristic-specific partial fine-tuning strategy, abbreviated as CSP-FT. First, we use a weighted-sum approach to analyze the contributions of different Transformer layers in a pre-trained codec language TTS model for emotion and speaker control in the generated speech.

Emotion Classification Speaker Identification +1

MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework

no code implementations • 13 Jan 2025 • Ping Guo, Cheng Gong, Xi Lin, Fei Liu, Zhichao Lu, Qingfu Zhang, Zhenkun Wang

Crafting adversarial examples is crucial for evaluating and enhancing the robustness of Deep Neural Networks (DNNs), presenting a challenge equivalent to maximizing a non-differentiable 0-1 loss function.

Adversarial Attack

EmoPro: A Prompt Selection Strategy for Emotional Expression in LM-based Speech Synthesis

no code implementations • 27 Sep 2024 • Haoyu Wang, Chunyu Qiang, Tianrui Wang, Cheng Gong, Qiuyu Liu, Yu Jiang, Xiaobao Wang, Chenyang Wang, Chen Zhang

Recent advancements in speech synthesis models, trained on extensive datasets, have demonstrated remarkable zero-shot capabilities.

Speech Synthesis

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

no code implementations • 11 Aug 2024 • Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jianwu Dang, JianHua Tao

For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired, emphasizing the semantic content of the text modality while de-emphasizing the paralinguistic information of the speech modality.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Deep Feature Surgery: Towards Accurate and Efficient Multi-Exit Networks

1 code implementation • 19 Jul 2024 • Cheng Gong, Yao Chen, Qiuyang Luo, Ye Lu, Tao Li, Yuzhi Zhang, Yufei Sun, Le Zhang

Experimental results on the CIFAR-100 and ImageNet datasets show that the proposed method provides up to a 50.00% reduction in training time and attains up to a 6.94% improvement in accuracy compared with baseline methods across diverse models and tasks.

Learning Pareto Set for Multi-Objective Continuous Robot Control

1 code implementation • 27 Jun 2024 • Tianye Shu, Ke Shang, Cheng Gong, Yang Nan, Hisao Ishibuchi

For a control problem with multiple conflicting objectives, there exists a set of Pareto-optimal policies called the Pareto set instead of a single optimal policy.

Multi-Objective Reinforcement Learning

Motion planning for off-road autonomous driving based on human-like cognition and weight adaptation

no code implementations • 27 Apr 2024 • YuChun Wang, Cheng Gong, Jianwei Gong, Peng Jia

Then, based on human-like generated trajectories in different environments, we design a primitive-based trajectory planner that aims to mimic human trajectories and cost weight selection, generating trajectories that are consistent with the dynamics of off-road vehicles.

Autonomous Driving Motion Planning

Minimize Quantization Output Error with Bias Compensation

1 code implementation • 2 Apr 2024 • Cheng Gong, Haoshuai Zheng, Mengting Hu, Zheng Lin, Deng-Ping Fan, Yuzhi Zhang, Tao Li

Quantization is a promising method that reduces the memory usage and computational intensity of Deep Neural Networks (DNNs), but it often leads to significant output error that hinders model deployment.

Quantization

Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume

no code implementations • 8 Mar 2024 • Ping Guo, Cheng Gong, Xi Lin, Zhiyuan Yang, Qingfu Zhang

To address this gap, we propose a new metric termed adversarial hypervolume, assessing the robustness of deep learning models comprehensively over a range of perturbation intensities from a multi-objective optimization standpoint.

Adversarial Robustness Benchmarking +1

AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks

no code implementations • 7 Apr 2023 • Cheng Gong, Ye Lu, Surong Dai, Deng Qian, Chenkun Du, Tao Li

QSS introduces five quantizing schemes and defines three new schemes as a candidate set for scheme search, and then uses the differentiable neural architecture search (DNAS) algorithm to seek the layer- or model-desired scheme from the set.

Neural Architecture Search Quantization

Using multiple reference audios and style embedding constraints for speech synthesis

no code implementations • 9 Oct 2021 • Cheng Gong, Longbiao Wang, ZhenHua Ling, Ju Zhang, Jianwu Dang

The end-to-end speech synthesis model can directly take an utterance as reference audio, and generate speech from the text with prosody and speaker characteristics similar to the reference audio.

Sentence Sentence Similarity +1

Orientation-Aware Planning for Parallel Task Execution of Omni-Directional Mobile Robot

no code implementations • 2 Aug 2021 • Cheng Gong, Zirui Li, Xingyu Zhou, Jiachen Li, Jianwei Gong, Junhui Zhou

Omni-directional mobile robot (OMR) systems have been very popular in academia and industry for their superb maneuverability and flexibility.

Position

Autonomous Driving Strategies at Intersections: Scenarios, State-of-the-Art, and Future Outlooks

no code implementations • 24 Jun 2021 • Lianzhen Wei, Zirui Li, Jianwei Gong, Cheng Gong, Jiachen Li

Due to the complex and dynamic character of intersection scenarios, the autonomous driving strategy at intersections has been a difficult problem and a hot topic in intelligent transportation systems research in recent years.

Autonomous Driving

DiDiSpeech: A Large Scale Mandarin Speech Corpus

1 code implementation • 19 Oct 2020 • Tingwei Guo, Cheng Wen, Dongwei Jiang, Ne Luo, Ruixiong Zhang, Shuaijiang Zhao, Wubo Li, Cheng Gong, Wei Zou, Kun Han, Xiangang Li

This paper introduces a new open-sourced Mandarin speech corpus, called DiDiSpeech.

Audio and Speech Processing

High-precision target positioning system for unmanned vehicles based on binocular vision

no code implementations • 17 Sep 2020 • Xianqi He, Zirui Li, Xufeng Yin, Jianwei Gong, Cheng Gong

To verify the effectiveness of the system, this paper measures the accuracy and computation time of its output for a cylinder in different poses.

Pose Estimation Position +1

Driver Behavior Modelling at the Urban Intersection via Canonical Correlation Analysis

no code implementations • 11 Jul 2020 • Zirui Li, Chao Lu, Cheng Gong, Jinghang Li, Lianzhen Wei

Accurately modelling the driver behavior at the intersection is essential for intelligent transportation systems (ITS).

feature selection regression

VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization

1 code implementation • 18 May 2020 • Cheng Gong, Yao Chen, Ye Lu, Tao Li, Cong Hao, Deming Chen

Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs.

Model Compression object-detection +2

Enhanced-alignment Measure for Binary Foreground Map Evaluation

2 code implementations • 26 May 2018 • Deng-Ping Fan, Cheng Gong, Yang Cao, Bo Ren, Ming-Ming Cheng, Ali Borji

Existing binary foreground map (FM) measures address various types of errors in either pixel-wise or structural ways.
