Search Results for author: Jianwu Dang

Found 17 papers, 3 papers with code

MIMO-DoAnet: Multi-channel Input and Multiple Outputs DoA Network with Unknown Number of Sound Sources

1 code implementation · 15 Jul 2022 · Haoran Yin, Meng Ge, Yanjie Fu, Gaoyan Zhang, Longbiao Wang, Lei Zhang, Lin Qiu, Jianwu Dang

These algorithms usually map the multi-channel audio input to a single output (i.e., the overall spatial pseudo-spectrum (SPS) of all sources), an approach called MISO.

Iterative Sound Source Localization for Unknown Number of Sources

no code implementations · 24 Jun 2022 · Yanjie Fu, Meng Ge, Haoran Yin, Xinyuan Qian, Longbiao Wang, Gaoyan Zhang, Jianwu Dang

Sound source localization aims to seek the direction of arrival (DOA) of all sound sources from the observed multi-channel audio.

Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning

no code implementations · 30 Apr 2022 · Di Jin, Cuiying Huo, Jianwu Dang, Peican Zhu, Weixiong Zhang, Witold Pedrycz, Lingfei Wu

However, existing contrastive learning methods are inadequate for heterogeneous graphs because they construct contrastive views based only on data perturbation or pre-defined structural properties (e.g., meta-paths) in graph data, while ignoring the noise that may exist in both node attributes and graph topologies.

Contrastive Learning

TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding

no code implementations · 17 Mar 2022 · Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Junhai Xu, Lin Zhang, Yantao Ji, Jianwu Dang

Therefore, in most current state-of-the-art network architectures, only a few branches corresponding to a limited number of temporal scales can be designed for speaker embeddings.

Speaker Verification

L-SpEx: Localized Target Speaker Extraction

1 code implementation · 21 Feb 2022 · Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li

Speaker extraction aims to extract the target speaker's voice from a multi-talker speech mixture given an auxiliary reference utterance.

Using multiple reference audios and style embedding constraints for speech synthesis

no code implementations · 9 Oct 2021 · Cheng Gong, Longbiao Wang, ZhenHua Ling, Ju Zhang, Jianwu Dang

The end-to-end speech synthesis model can directly take an utterance as reference audio, and generate speech from the text with prosody and speaker characteristics similar to the reference audio.

Sentence Similarity, Speech Synthesis

Exploring Deep Learning for Joint Audio-Visual Lip Biometrics

1 code implementation · 17 Apr 2021 · Meng Liu, Longbiao Wang, Kong Aik Lee, Hanyi Zhang, Chang Zeng, Jianwu Dang

Audio-visual (AV) lip biometrics is a promising authentication technique that leverages the benefits of both the audio and visual modalities in speech communication.

Speaker Recognition

SpEx+: A Complete Time Domain Speaker Extraction Network

no code implementations · 10 May 2020 · Meng Ge, Cheng-Lin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li

To eliminate such mismatch, we propose a complete time-domain speaker extraction solution called SpEx+.

Audio and Speech Processing, Sound

Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning

no code implementations · 2 May 2020 · Qiang Yu, Shenglan Li, Huajin Tang, Longbiao Wang, Jianwu Dang, Kay Chen Tan

They are also believed to play an essential role in the low power consumption of biological systems, whose efficiency attracts increasing attention to the field of neuromorphic computing.

Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection

no code implementations · 23 Oct 2019 · Zhilei Liu, Jiahui Dong, Cuicui Zhang, Longbiao Wang, Jianwu Dang

Most existing AU detection works that consider AU relationships rely on probabilistic graphical models with manually extracted features.

Action Unit Detection, Facial Action Unit Detection

Robust Environmental Sound Recognition with Sparse Key-point Encoding and Efficient Multi-spike Learning

no code implementations · 4 Feb 2019 · Qiang Yu, Yanli Yao, Longbiao Wang, Huajin Tang, Jianwu Dang, Kay Chen Tan

Our framework is a unifying system that consistently integrates three major functional parts: sparse encoding, efficient learning, and robust readout.

Decision Making

Implicit Discourse Relation Recognition using Neural Tensor Network with Interactive Attention and Sparse Learning

no code implementations · COLING 2018 · Fengyu Guo, Ruifang He, Di Jin, Jianwu Dang, Longbiao Wang, Xiangang Li

In this paper, we propose a novel neural Tensor network framework with Interactive Attention and Sparse Learning (TIASL) for implicit discourse relation recognition.

Sparse Learning, Text Summarization

Speech Emotion Recognition Considering Local Dynamic Features

no code implementations · 21 Mar 2018 · Haotian Guan, Zhilei Liu, Longbiao Wang, Jianwu Dang, Ruiguo Yu

Recently, increasing attention has been directed to the study of speech emotion recognition, in which global acoustic features of an utterance are mostly used to eliminate content differences.

Speech Emotion Recognition
