Search Results for author: Yuguang Yang

Found 16 papers, 5 papers with code

PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System

no code implementations28 Sep 2023 Xiang Lyu, Yuhang Cao, Qing Wang, JingJing Yin, Yuguang Yang, Pengpeng Zou, Yanni Hu, Heng Lu

Speaker-attributed automatic speech recognition (SA-ASR) improves the accuracy and applicability of multi-speaker ASR systems in real-world scenarios by assigning speaker labels to transcribed texts.

Action Detection Activity Detection +3

PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts

no code implementations17 Sep 2023 Jixun Yao, Yuguang Yang, Yi Lei, Ziqian Ning, Yanni Hu, Yu Pan, JingJing Yin, Hongbin Zhou, Heng Lu, Lei Xie

In this study, we propose PromptVC, a novel style voice conversion approach that employs a latent diffusion model to generate a style vector driven by natural language prompts.

Voice Conversion

MSAC: Multiple Speech Attribute Control Method for Reliable Speech Emotion Recognition

no code implementations8 Aug 2023 Yu Pan, Yuguang Yang, Yuheng Huang, Jixun Yao, JingJing Yin, Yanni Hu, Heng Lu, Lei Ma, Jianjun Zhao

Despite notable progress, speech emotion recognition (SER) remains challenging due to the intricate and ambiguous nature of speech emotion, particularly in wild world.

Attribute Cross-corpus +2

GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition

no code implementations13 Jun 2023 Yu Pan, Yanni Hu, Yuguang Yang, Wen Fei, Jixun Yao, Heng Lu, Lei Ma, Jianjun Zhao

Contrastive cross-modality pretraining has recently exhibited impressive success in diverse fields, whereas there is limited research on their merits in speech emotion recognition (SER).

Attribute Contrastive Learning +3

Decom--CAM: Tell Me What You See, In Details! Feature-Level Interpretation via Decomposition Class Activation Map

no code implementations27 May 2023 Yuguang Yang, Runtang Guo, Sheng Wu, Yimi Wang, Juan Zhang, Xuan Gong, Baochang Zhang

Although the Class Activation Map (CAM) is widely used to interpret deep model predictions by highlighting object location, it fails to provide insight into the salient features used by the model to make decisions.

Decision Making

Self-supervised Speaker Recognition Training Using Human-Machine Dialogues

no code implementations7 Feb 2022 Metehan Cekic, Ruirui Li, Zeya Chen, Yuguang Yang, Andreas Stolcke, Upamanyu Madhow

Speaker recognition, recognizing speaker identities based on voice alone, enables important downstream applications, such as personalization and authentication.

Contrastive Learning Speaker Recognition

ASR-Aware End-to-end Neural Diarization

no code implementations2 Feb 2022 Aparna Khare, Eunjung Han, Yuguang Yang, Andreas Stolcke

We present a Conformer-based end-to-end neural diarization (EEND) model that uses both acoustic input and features derived from an automatic speech recognition (ASR) model.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Fast query-by-example speech search using separable model

no code implementations18 Sep 2021 Yuguang Yang, Yu Pan, Xin Dong, Minqiang Xu

Second, we design a novel model inference scheme based on RepVGG which can efficiently improve the QbE search quality.

Word Embeddings

A Deep Reinforcement Learning Architecture for Multi-stage Optimal Control

no code implementations25 Nov 2019 Yuguang Yang

SDQL exploits the linear stage structure by approximating the Q function via a collection of deep Q sub-networks stacking along an axis marking the stage-wise progress of the whole task.

Q-Learning reinforcement-learning +1

Efficient Navigation of Colloidal Robots in an Unknown Environment via Deep Reinforcement Learning

no code implementations26 Jun 2019 Yuguang Yang, Michael A. Bevan, Bo Li

Equipping active colloidal robots with intelligence such that they can efficiently navigate in unknown complex environments could dramatically impact their use in emerging applications like precision surgery and targeted drug delivery.

Navigate Reinforcement Learning (RL)

Cannot find the paper you are looking for? You can Submit a new open access paper.