Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation

no code implementations • 21 Jun 2023 • Zhonghua Liu, Shijun Wang, Ning Chen

In this paper, we propose a VC model that can automatically disentangle speech into four components using only two augmentation functions, without the requirement of multiple hand-crafted features or laborious bottleneck tuning.

Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech

no code implementations • 9 Jun 2023 • Shijun Wang, Jón Guðnason, Damian Borth

Effective speech emotional representations play a key role in Speech Emotion Recognition (SER) and Emotional Text-To-Speech (TTS) tasks.

A Graph Regularized Point Process Model For Event Propagation Sequence

no code implementations • 21 Nov 2022 • Siqiao Xue, Xiaoming Shi, Hongyan Hao, Lintao Ma, Shiyu Wang, Shijun Wang, James Zhang

Point processes are the dominant paradigm for modeling event sequences that occur at irregular intervals.

A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud

1 code implementation • 31 May 2022 • Siqiao Xue, Chao Qu, Xiaoming Shi, Cong Liao, Shiyi Zhu, Xiaoyu Tan, Lintao Ma, Shiyu Wang, Shijun Wang, Yun Hu, Lei Lei, Yangfei Zheng, Jianguo Li, James Zhang

Predictive autoscaling (autoscaling with workload forecasting) is an important mechanism that supports autonomous adjustment of computing resources in accordance with fluctuating workload demands in the Cloud.

Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning

no code implementations • 27 Oct 2021 • Shijun Wang, Dimche Kostadinov, Damian Borth

We then use the learned prosodic representations as conditional information to train and enhance our VC model for zero-shot conversion.

NoiseVC: Towards High Quality Zero-Shot Voice Conversion

no code implementations • 13 Apr 2021 • Shijun Wang, Damian Borth

Voice conversion (VC) is a task that transforms voice from target audio to source without losing linguistic content. It is especially challenging when the source and target speakers are unseen during training (zero-shot VC).

Neural Physicist: Learning Physical Dynamics from Image Sequences

no code implementations • 9 Jun 2020 • Baocheng Zhu, Shijun Wang, James Zhang

In this paper, by leveraging recent progress in representation learning and state space models (SSMs), we propose NeurPhy, which uses a variational auto-encoder (VAE) to extract the underlying Markovian dynamic state at each time step, a neural process (NP) to extract the global system parameters, and a non-linear non-recurrent stochastic state space model to learn the physical dynamic transition.

A Riemannian Primal-dual Algorithm Based on Proximal Operator and its Application in Metric Learning

no code implementations • 19 May 2020 • Shijun Wang, Baocheng Zhu, Lintao Ma, Yuan Qi

In this paper, we consider optimizing a smooth, convex, lower semicontinuous function in Riemannian space with constraints.

Riemannian Proximal Policy Optimization

no code implementations • 19 May 2020 • Shijun Wang, Baocheng Zhu, Chen Li, Mingzhe Wu, James Zhang, Wei Chu, Yuan Qi

In this paper, we propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems.

2D View Aggregation for Lymph Node Detection Using a Shallow Hierarchy of Linear Classifiers

no code implementations • 14 Aug 2014 • Ari Seff, Le Lu, Kevin M. Cherry, Holger Roth, Jiamin Liu, Shijun Wang, Joanne Hoffman, Evrim B. Turkbey, Ronald M. Summers

In this paper, we propose a new algorithm representation that decomposes the lymph node (LN) detection problem into a set of 2D object detection subtasks on sampled CT slices, largely alleviating the curse of dimensionality.

