Search Results for author: Xinyuan Qian

Found 10 papers, 5 papers with code

Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training

no code implementations • 1 Apr 2024 • Ruijie Tao, Xinyuan Qian, Rohan Kumar Das, Xiaoxue Gao, Jiadong Wang, Haizhou Li

Audio-visual active speaker detection (AV-ASD) aims to identify which visible face is speaking in a scene with one or more persons.

Audio-Visual Active Speaker Detection, Denoising +1

LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism

no code implementations • 16 Oct 2023 • Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li

The prevailing noise-resistant and reverberation-resistant localization algorithms primarily emphasize separating and providing directional output for each speaker in multi-speaker scenarios, without association with the identity of speakers.

InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

no code implementations • 24 May 2023 • Zhi-Hao Lai, Tian-Hao Zhang, Qi Liu, Xinyuan Qian, Li-Fang Wei, Song-Lu Chen, Feng Chen, Xu-Cheng Yin

To address these issues, this paper proposes InterFormer for interactive local and global features fusion to learn a better representation for ASR.

Automatic Speech Recognition (ASR) +1

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding

no code implementations • 23 May 2023 • Tian-Hao Zhang, Hai-Bo Qin, Zhi-Hao Lai, Song-Lu Chen, Qi Liu, Feng Chen, Xinyuan Qian, Xu-Cheng Yin

The experimental results show that ASCD significantly improves the performance by leveraging both the acoustic and semantic information cooperatively.

Speech Recognition

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

1 code implementation • CVPR 2023 • Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li

To address the problem, we propose using a lip-reading expert to improve the intelligibility of the generated lip regions by penalizing the incorrect generation results.

Contrastive Learning, Lip Reading +1

352

A Miniaturised Camera-based Multi-Modal Tactile Sensor

no code implementations • 6 Mar 2023 • Kaspar Althoefer, Yonggen Ling, Wanlin Li, Xinyuan Qian, Wang Wei Lee, Peng Qi

The human tactile system is composed of various types of mechanoreceptors, each able to perceive and process distinct information such as force, pressure, texture, etc.

Iterative Sound Source Localization for Unknown Number of Sources

2 code implementations • 24 Jun 2022 • Yanjie Fu, Meng Ge, Haoran Yin, Xinyuan Qian, Longbiao Wang, Gaoyan Zhang, Jianwu Dang

Sound source localization aims to seek the direction of arrival (DOA) of all sound sources from the observed multi-channel audio.

Speaker Extraction with Co-Speech Gestures Cue

1 code implementation • 31 Mar 2022 • Zexu Pan, Xinyuan Qian, Haizhou Li

Speaker extraction seeks to extract the clean speech of a target speaker from a multi-talker mixture speech.

Speech Separation

Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection

4 code implementations • 14 Jul 2021 • Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li

Active speaker detection (ASD) seeks to detect who is speaking in a visual scene of one or more speakers.

Audio-Visual Active Speaker Detection

254

NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker)

1 code implementation • The ActivityNet Large-Scale Activity Recognition Challenge Workshop, CVPR 2021 • Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li

Active speaker detection (ASD) seeks to detect who is speaking in a visual scene of one or more speakers.

Ranked #9 on Audio-Visual Active Speaker Detection on AVA-ActiveSpeaker

Audio-Visual Active Speaker Detection

254
