Search Results for author: Yuping Wang

Found 24 papers, 3 papers with code

MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN

no code implementations27 Aug 2024 Miao Ye, Hongwen Hu, Xiaoli Wang, Yuping Wang, Yong Wang, Wen Peng, Jihao Zheng

An agent is established for each controller, and a cooperation mechanism between multiple agents is designed to effectively optimize cross-domain multicast routing and ensure consistency and validity in the representation of network state information for cross-domain multicast routing decisions.

Deep Reinforcement Learning

StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion

no code implementations5 Aug 2024 Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang

StreamVoice+ integrates a semantic encoder and a connector with the original StreamVoice framework, now trained using a non-streaming ASR.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Open Challenges and Opportunities in Federated Foundation Models Towards Biomedical Healthcare

no code implementations10 May 2024 Xingyu Li, Lu Peng, Yuping Wang, Weihua Zhang

This survey explores the transformative impact of foundation models (FMs) in artificial intelligence, focusing on their integration with federated learning (FL) for advancing biomedical research.

Federated Learning Self-Supervised Learning +1

VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing

no code implementations10 Apr 2024 Philip Anastassiou, Zhenyu Tang, Kainan Peng, Dongya Jia, Jiaxin Li, Ming Tu, Yuping Wang, Yuxuan Wang, Mingbo Ma

We present VoiceShop, a novel speech-to-speech framework that can modify multiple attributes of speech, such as age, gender, accent, and speech style, in a single forward pass while preserving the input speaker's timbre.

Attribute

CMP: Cooperative Motion Prediction with Multi-Agent Communication

no code implementations26 Mar 2024 Zehao Wang, Yuping Wang, Zhuoyuan Wu, Hengbo Ma, Zhaowei Li, Hang Qiu, Jiachen Li

Through extensive experiments and ablation studies on the OPV2V and V2V4Real datasets, we demonstrate the effectiveness of our method in cooperative perception, tracking, and motion prediction.

Autonomous Vehicles motion prediction

StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion

no code implementations19 Jan 2024 Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang

Specifically, to enable streaming capability, StreamVoice employs a fully causal context-aware LM with a temporal-independent acoustic predictor, while alternately processing semantic and acoustic features at each time step of autoregression which eliminates the dependence on complete source speech.

Language Modelling Voice Conversion

Towards the Unification of Generative and Discriminative Visual Foundation Model: A Survey

no code implementations15 Dec 2023 Xu Liu, Tong Zhou, Yuanxin Wang, Yuping Wang, Qinjingwen Cao, Weizhi Du, Yonghuan Yang, Junjun He, Yu Qiao, Yiqing Shen

The advent of foundation models, which are pre-trained on vast datasets, has ushered in a new era of computer vision, characterized by their robustness and remarkable zero-shot generalization capabilities.

Image Generation Image Segmentation +2

Novel View Synthesis from a Single RGBD Image for Indoor Scenes

no code implementations2 Nov 2023 Congrui Hetang, Yuping Wang

In this paper, we propose an approach for synthesizing novel view images from a single RGBD (Red Green Blue-Depth) input.

Novel View Synthesis Style Transfer

EqDrive: Efficient Equivariant Motion Forecasting with Multi-Modality for Autonomous Driving

no code implementations26 Oct 2023 Yuping Wang, Jier Chen

Forecasting vehicular motions in autonomous driving requires a deep understanding of agent interactions and the preservation of motion equivariance under Euclidean geometric transformations.

Motion Forecasting

Equivariant Map and Agent Geometry for Autonomous Driving Motion Prediction

no code implementations21 Oct 2023 Yuping Wang, Jier Chen

This research introduces a groundbreaking solution by employing EqMotion, a theoretically geometric equivariant and interaction invariant motion prediction model for particles and humans, plus integrating agent-equivariant high-definition (HD) map features for context aware motion prediction in autonomous driving.

Autonomous Driving motion prediction

MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling

no code implementations3 Sep 2023 Zhichao Wang, Xinsheng Wang, Qicong Xie, Tao Li, Lei Xie, Qiao Tian, Yuping Wang

In addition to conveying the linguistic content from source speech to converted speech, maintaining the speaking style of source speech also plays an important role in the voice conversion (VC) task, which is essential in many scenarios with highly expressive source speech, such as dubbing and data augmentation.

Data Augmentation Disentanglement +3

LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models

no code implementations18 Jun 2023 Zhichao Wang, Yuanzhe Chen, Lei Xie, Qiao Tian, Yuping Wang

An intuitive approach is to follow AudioLM - Tokenizing speech into semantic and acoustic tokens respectively by HuBERT and SoundStream, and converting source semantic tokens to target acoustic tokens conditioned on acoustic tokens of the target speaker.

Audio Generation Disentanglement +2

Multi-level Temporal-channel Speaker Retrieval for Zero-shot Voice Conversion

no code implementations12 May 2023 Zhichao Wang, Liumeng Xue, Qiuqiang Kong, Lei Xie, Yuanzhe Chen, Qiao Tian, Yuping Wang

Specifically, to flexibly adapt to the dynamic-variant speaker characteristic in the temporal and channel axis of the speech, we propose a novel fine-grained speaker modeling method, called temporal-channel retrieval (TCR), to find out when and where speaker information appears in speech.

Disentanglement Retrieval +2

Zero-Shot Accent Conversion using Pseudo Siamese Disentanglement Network

no code implementations12 Dec 2022 Dongya Jia, Qiao Tian, Kainan Peng, Jiaxin Li, Yuanzhe Chen, Mingbo Ma, Yuping Wang, Yuxuan Wang

The goal of accent conversion (AC) is to convert the accent of speech into the target accent while preserving the content and speaker identity.

Data Augmentation Disentanglement

Delivering Speaking Style in Low-resource Voice Conversion with Multi-factor Constraints

no code implementations16 Nov 2022 Zhichao Wang, Xinsheng Wang, Lei Xie, Yuanzhe Chen, Qiao Tian, Yuping Wang

Conveying the linguistic content and maintaining the source speech's speaking style, such as intonation and emotion, is essential in voice conversion (VC).

Voice Conversion

Neural Dubber: Dubbing for Videos According to Scripts

no code implementations NeurIPS 2021 Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao

Neural Dubber is a multi-modal text-to-speech (TTS) model that utilizes the lip movement in the video to control the prosody of the generated speech.

Text to Speech

Joint framework with deep feature distillation and adaptive focal loss for weakly supervised audio tagging and acoustic event detection

no code implementations23 Mar 2021 Yunhao Liang, Yanhua Long, Yijie Li, Jiaen Liang, Yuping Wang

A good joint training framework is very helpful to improve the performances of weakly supervised audio tagging (AT) and acoustic event detection (AED) simultaneously.

Audio Tagging Event Detection

Cannot find the paper you are looking for? You can Submit a new open access paper.