Search Results for author: Jiadong Wang

Found 8 papers, 4 papers with code

MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge

1 code implementation30 May 2025 Xin Jing, Jiadong Wang, Iosif Tsangko, Andreas Triantafyllopoulos, Björn W. Schuller

We demonstrate the effectiveness of MELT by fine-tuning four self-supervised learning (SSL) backbones and assessing speech emotion recognition performance across emotion datasets.

Self-Supervised Learning Speech Emotion Recognition

The First MPDD Challenge: Multimodal Personality-aware Depression Detection

no code implementations15 May 2025 Changzeng Fu, Zelin Fu, Xinhe Kuang, Jiacheng Dong, Qi Zhang, Kaifeng Su, Yikai Su, Wenbo Shi, Junfeng Yao, Yuliang Zhao, Shiqi Zhao, Jiadong Wang, Siyang Song, Chaoran Liu, Yuichiro Yoshikawa, Björn Schuller, Hiroshi Ishiguro

The Multimodal Personality-aware Depression Detection (MPDD) Challenge aims to address this gap by incorporating multimodal data alongside individual difference factors.

Depression Detection

$C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction

no code implementations1 Apr 2025 Wenxuan Wu, Xueyuan Chen, Shuai Wang, Jiadong Wang, Lingwei Meng, Xixin Wu, Helen Meng, Haizhou Li

Audio-Visual Target Speaker Extraction (AV-TSE) aims to mimic the human ability to enhance auditory perception using visual cues.

Target Speaker Extraction

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

1 code implementation29 Apr 2024 Ruijie Tao, Xinyuan Qian, Yidi Jiang, Junjie Li, Jiadong Wang, Haizhou Li

To this end, we propose a novel selective auditory attention mechanism, which can suppress interference speakers and non-speech signals to avoid incorrect speaker extraction.

Target Speaker Extraction

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

1 code implementation CVPR 2023 Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li

To address the problem, we propose using a lip-reading expert to improve the intelligibility of the generated lip regions by penalizing the incorrect generation results.

Contrastive Learning Lip Reading +1

Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks

no code implementations26 Mar 2020 Malu Zhang, Jiadong Wang, Burin Amornpaisannon, Zhixuan Zhang, VPK Miriyala, Ammar Belatreche, Hong Qu, Jibin Wu, Yansong Chua, Trevor E. Carlson, Haizhou Li

In STDBP algorithm, the timing of individual spikes is used to convey information (temporal coding), and learning (back-propagation) is performed based on spike timing in an event-driven manner.

Decision Making

Cannot find the paper you are looking for? You can Submit a new open access paper.