no code implementations • 13 Feb 2025 • Xin Wang, Héctor Delgado, Hemlata Tak, Jee-weon Jung, Hye-jin Shim, Massimiliano Todisco, Ivan Kukanov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen, Nicholas Evans, Kong Aik Lee, Junichi Yamagishi, Myeonghun Jeong, Ge Zhu, Yongyi Zang, You Zhang, Soumi Maiti, Florian Lux, Nicolas Müller, Wangyou Zhang, Chengzhe Sun, Shuwei Hou, Siwei Lyu, Sébastien Le Maguer, Cheng Gong, Hanjie Guo, Liping Chen, Vishwanath Singh
The database contains attacks generated with 32 different algorithms, also crowdsourced, and optimised to varying degrees using new surrogate detection models.
no code implementations • 24 Jan 2025 • Tianrui Wang, Meng Ge, Cheng Gong, Chunyu Qiang, Haoyu Wang, Zikang Huang, Yu Jiang, Xiaobao Wang, Xie Chen, Longbiao Wang, Jianwu Dang
To address these challenges, we propose a characteristic-specific partial fine-tuning strategy, CSP-FT for short. First, we use a weighted-sum approach to analyze the contributions of different Transformer layers in a pre-trained codec language TTS model for emotion and speaker control in the generated speech.
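As a rough illustration of what a weighted-sum layer analysis can look like, the sketch below combines per-layer feature vectors with softmax-normalized weights and returns those weights for inspection. This is not the authors' implementation: the function names, the toy feature vectors, and the fixed logits are all hypothetical, and in practice the logits would presumably be learned on a probing task such as emotion or speaker classification.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_sum(layer_outputs, layer_logits):
    """Combine per-layer feature vectors using softmax-normalized weights.

    layer_outputs: one feature vector per Transformer layer (hypothetical toy data).
    layer_logits:  one scalar per layer; assumed to be learned elsewhere.
    Returns the combined vector and the normalized per-layer weights.
    """
    weights = softmax(layer_logits)
    dim = len(layer_outputs[0])
    combined = [
        sum(w * layer[d] for w, layer in zip(weights, layer_outputs))
        for d in range(dim)
    ]
    return combined, weights


# Toy usage: three "layers" with two-dimensional features and equal logits.
combined, weights = weighted_layer_sum(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [0.0, 0.0, 0.0],
)
```

A larger normalized weight suggests that the corresponding layer contributes more to the probed characteristic, which is the kind of signal a partial fine-tuning strategy could use to choose which layers to update.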
no code implementations • 13 Jan 2025 • Ping Guo, Cheng Gong, Xi Lin, Fei Liu, Zhichao Lu, Qingfu Zhang, Zhenkun Wang
Crafting adversarial examples is crucial for evaluating and enhancing the robustness of Deep Neural Networks (DNNs), presenting a challenge equivalent to maximizing a non-differentiable 0-1 loss function.
no code implementations • 27 Sep 2024 • Haoyu Wang, Chunyu Qiang, Tianrui Wang, Cheng Gong, Qiuyu Liu, Yu Jiang, Xiaobao Wang, Chenyang Wang, Chen Zhang
Recent advancements in speech synthesis models, trained on extensive datasets, have demonstrated remarkable zero-shot capabilities.
no code implementations • 11 Aug 2024 • Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jianwu Dang, JianHua Tao
For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired, emphasizing the semantic content of the text modality while de-emphasizing the paralinguistic information of the speech modality.
Automatic Speech Recognition (ASR)
1 code implementation • 19 Jul 2024 • Cheng Gong, Yao Chen, Qiuyang Luo, Ye Lu, Tao Li, Yuzhi Zhang, Yufei Sun, Le Zhang
Experimental results on the CIFAR-100 and ImageNet datasets show that the proposed method provides up to a 50.00% reduction in training time and up to a 6.94% improvement in accuracy compared with baseline methods across diverse models and tasks.
1 code implementation • 27 Jun 2024 • Tianye Shu, Ke Shang, Cheng Gong, Yang Nan, Hisao Ishibuchi
For a control problem with multiple conflicting objectives, there exists a set of Pareto-optimal policies called the Pareto set instead of a single optimal policy.
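To make the notion of a Pareto set concrete, here is a minimal sketch that filters a list of candidate policies down to the non-dominated ones, given each policy's vector of objective returns. It assumes every objective is maximized; the function names and the example numbers are illustrative only and are not taken from the paper.

```python
def dominates(a, b):
    """Return True if return vector a Pareto-dominates b (maximization).

    a dominates b when it is at least as good in every objective
    and strictly better in at least one.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_set(returns):
    """Return the indices of the non-dominated return vectors."""
    return [
        i
        for i, r in enumerate(returns)
        if not any(dominates(other, r) for j, other in enumerate(returns) if j != i)
    ]


# Hypothetical two-objective returns for five candidate policies.
candidate_returns = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (0.5, 0.5)]
non_dominated = pareto_set(candidate_returns)
```

Here the third and fifth candidates are dominated by the second, so they drop out; the remaining policies trade one objective against the other and none is strictly better than the rest.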
1 code implementation • 13 Jun 2024 • Cheng Gong, Erica Cooper, Xin Wang, Chunyu Qiang, Mengzhe Geng, Dan Wells, Longbiao Wang, Jianwu Dang, Marc Tessier, Aidan Pine, Korin Richmond, Junichi Yamagishi
Self-supervised learning (SSL) representations from massively multilingual models offer a promising solution for low-resource language speech tasks.
no code implementations • 27 Apr 2024 • YuChun Wang, Cheng Gong, Jianwei Gong, Peng Jia
Then, based on human-like generated trajectories in different environments, we design a primitive-based trajectory planner that aims to mimic human trajectories and cost weight selection, generating trajectories that are consistent with the dynamics of off-road vehicles.
1 code implementation • 2 Apr 2024 • Cheng Gong, Haoshuai Zheng, Mengting Hu, Zheng Lin, Deng-Ping Fan, Yuzhi Zhang, Tao Li
Quantization is a promising method that reduces the memory usage and computational intensity of Deep Neural Networks (DNNs), but it often leads to significant output errors that hinder model deployment.
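The trade-off between bit width and error can be seen in a minimal sketch of uniform (min-max) quantization. This is a generic textbook scheme chosen for illustration, not the method proposed in the paper; the function names and sample weights are hypothetical.

```python
def quantize_uniform(values, num_bits=8):
    """Uniformly quantize floats to 2**num_bits levels over their min-max range.

    Returns the de-quantized values, i.e. the floats that the integer
    codes would map back to.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a constant input needs no quantization
    scale = (hi - lo) / (2 ** num_bits - 1)
    return [lo + round((v - lo) / scale) * scale for v in values]


def mean_squared_error(a, b):
    """Mean squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical weights: fewer bits means coarser levels and larger error.
weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
error_2bit = mean_squared_error(weights, quantize_uniform(weights, num_bits=2))
error_8bit = mean_squared_error(weights, quantize_uniform(weights, num_bits=8))
```

Each quantized value lies within half a quantization step of the original, so halving the step size by adding a bit shrinks the worst-case per-weight error accordingly; how such errors accumulate through a network's layers is a separate question.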
no code implementations • 8 Mar 2024 • Ping Guo, Cheng Gong, Xi Lin, Zhiyuan Yang, Qingfu Zhang
To address this gap, we propose a new metric termed adversarial hypervolume, assessing the robustness of deep learning models comprehensively over a range of perturbation intensities from a multi-objective optimization standpoint.
no code implementations • 7 Apr 2023 • Cheng Gong, Ye Lu, Surong Dai, Deng Qian, Chenkun Du, Tao Li
QSS introduces five quantizing schemes and defines three new schemes as a candidate set for scheme search, and then uses the differentiable neural architecture search (DNAS) algorithm to seek the layer- or model-desired scheme from the set.
no code implementations • 9 Oct 2021 • Cheng Gong, Longbiao Wang, ZhenHua Ling, Ju Zhang, Jianwu Dang
The end-to-end speech synthesis model can directly take an utterance as reference audio, and generate speech from the text with prosody and speaker characteristics similar to the reference audio.
no code implementations • 8 Sep 2021 • Cheng Gong, Ye Lu, Kunpeng Xie, Zongming Jin, Tao Li, Yanzhi Wang
We implement ESB as an accelerator and quantitatively evaluate its efficiency on FPGAs.
no code implementations • 2 Aug 2021 • Cheng Gong, Zirui Li, Xingyu Zhou, Jiachen Li, Jianwei Gong, Junhui Zhou
Omni-directional mobile robot (OMR) systems have been very popular in academia and industry for their superb maneuverability and flexibility.
no code implementations • 24 Jun 2021 • Lianzhen Wei, Zirui Li, Jianwei Gong, Cheng Gong, Jiachen Li
Due to the complex and dynamic nature of intersection scenarios, autonomous driving strategy at intersections has been a difficult problem and a hot topic in intelligent transportation systems research in recent years.
1 code implementation • 19 Oct 2020 • Tingwei Guo, Cheng Wen, Dongwei Jiang, Ne Luo, Ruixiong Zhang, Shuaijiang Zhao, Wubo Li, Cheng Gong, Wei Zou, Kun Han, Xiangang Li
This paper introduces a new open-sourced Mandarin speech corpus, called DiDiSpeech.
Audio and Speech Processing
no code implementations • 17 Sep 2020 • Xianqi He, Zirui Li, Xufeng Yin, Jianwei Gong, Cheng Gong
To verify the effectiveness of the system, this paper measures the accuracy and computation time of its output for a cylinder in different poses.
no code implementations • 11 Jul 2020 • Zirui Li, Chao Lu, Cheng Gong, Jinghang Li, Lianzhen Wei
Accurately modelling the driver behavior at the intersection is essential for intelligent transportation systems (ITS).
1 code implementation • 18 May 2020 • Cheng Gong, Yao Chen, Ye Lu, Tao Li, Cong Hao, Deming Chen
Quantization has been proven to be an effective method for reducing the computing and/or storage cost of DNNs.
2 code implementations • 2 Aug 2019 • Kun Han, Junwen Chen, Hui Zhang, Haiyang Xu, Yiping Peng, Yun Wang, Ning Ding, Hui Deng, Yonghu Gao, Tingwei Guo, Yi Zhang, Yahao He, Baochang Ma, Yu-Long Zhou, Kangli Zhang, Chao Liu, Ying Lyu, Chenxi Wang, Cheng Gong, Yunbo Wang, Wei Zou, Hui Song, Xiangang Li
In this paper we present DELTA, a deep learning based language technology platform.
Ranked #3 on Named Entity Recognition on CoNLL 2003 (English)
2 code implementations • 26 May 2018 • Deng-Ping Fan, Cheng Gong, Yang Cao, Bo Ren, Ming-Ming Cheng, Ali Borji
Existing binary foreground map (FM) measures address various types of errors in either pixel-wise or structural ways.