no code implementations • 23 Sep 2024 • Hao Che, Hong Jiang, Zhijun Wang
While preventing seller lock-in and improving efficiency and availability, a VSM suffers from a key weakness from a buyer's perspective, i.e., lock-in to the broker and its corresponding marketplace, which may lead to a suboptimal shopping experience for buyers due to the broker's monopoly over the marketplace and the limited choice of products in it.
no code implementations • 11 Aug 2024 • Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jianwu Dang, JianHua Tao
For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired, emphasizing the semantic content of the text modality while de-emphasizing the paralinguistic information of the speech modality.
Automatic Speech Recognition (ASR)
no code implementations • 14 Mar 2023 • Chunyu Qiang, Peng Yang, Hao Che, Ying Zhang, Xiaorui Wang, Zhongyuan Wang
Cross-speaker style transfer in speech synthesis aims to transfer a style from a source speaker to speech synthesized in a target speaker's timbre.
no code implementations • 13 Dec 2022 • Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang, Zhongyuan Wang
To improve the style extraction ability of the reference encoder, a style-invariant and contrastive data augmentation method is proposed.
no code implementations • 17 Nov 2022 • Chunyu Qiang, Peng Yang, Hao Che, Jinba Xiao, Xiaorui Wang, Zhongyuan Wang
In this paper, we propose a simple back-translation-style data augmentation method for Mandarin Chinese polyphone disambiguation, utilizing a large amount of unlabeled text data.
1 code implementation • 22 Jun 2021 • Xiwen Qu, Hao Che, Jun Huang, Linchuan Xu, Xiao Zheng
To this end, this paper designs a Multi-layered Semantic Representation Network (MSRN), which discovers both local and global semantics of labels by modeling label correlations and uses these label semantics to guide semantic representation learning at multiple layers through an attention mechanism.
Ranked #7 on Multi-Label Classification on PASCAL VOC 2007
no code implementations • 29 Oct 2019 • Xinyong Zhou, Hao Che, Xiaorui Wang, Lei Xie
In this paper, we present a cross-lingual voice cloning approach.